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Abstract 


A  Wavelet  Fractal  Method  For  Content  Based 
Image  and  Video  Compression 

Robert  J.  Bonneau 

Traditional  methods  of  image  and  video  coding  rely  on  linear  transformations  that  focus 
primarily  on  high  compression.  With  the  increasing  demand  for  digital  imagery  and  video  there  is 
now  a  need  for  functionality  of  the  compressed  information.  This  dissertation  develops  a  new 
framework  for  compression  that  uses  a  fractal  wavelet  method  to  break  the  imagery  into  shape, 
texture,  color,  and  motion.  With  this  new  organization,  image  information  is  readily  accessible  to 
the  user  in  compressed  form.  Based  on  this  compression  method,  we  then  develop  an  object-ori¬ 
ented  video  format. 

Image  analysts  tend  to  break  imagery  into  the  categories  of  shape,  texture,  color,  and 

motion.  Thus,  we  begin  our  approach  to  image  compression  by  finding  mathematical  methods 
that  preserve  shape  and  texture  in  an  efficient  manner.  This  new  non-traditional  method  begins  by 
using  fractals.  A  fractal  is  an  object  which  when  observed  at  its  smallest  level  of  detail  resembles 
the  overall  object  itself.  Some  natural  examples  include  ferns,  snowflakes,  clouds,  and  moun¬ 
tains.  Recently,  engineers  have  applied  fixed  point  theory  to  describe  a  method  of  fractal  image 
compression  .  Unfortunately  fixed  point  theory  only  provides  a  partial  description  of  fractal  com¬ 
pression,  since  it  says  little  about  the  spatial  frequency  structure  behind  the  process. 

For  insight  into  the  frequency  structure  of  fractal  compression,  engineers  and  scientists 
have  recently  turned  to  wavelets,  since  wavelet  methods  mirror  the  fractal  compression  process. 
Understanding  the  frequency  structure  enables  us  to  see  how  such  a  shape  and  texture  work 


together  in  the  fractal  compression  process.  With  this  new  insight  into  fractal  compression,  we 
may  design  a  much  more  efficient  compression  system.  Using  this  new  compression  process  we 
also  include  color  and  motion.  The  system  developed  can  then  be  used  for  compression  of  both 
imagry  and  video  as  well  as  analysis  and  comparison  of  the  compressed  information 
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2 

1.1  Motivation 

The  speed  of  information  exchange  has  increased  dramatically  over  the  last  several 
decades.  This  speed  has  brought  with  it  increased  need  for  better  compression  of  everything  from 
photographs  to  movies.  In  order  to  meet  this  demand  we  look  to  new  methods  to  decompose 
imagery  into  an  extremely  compact  form  that  can  be  analyzed  for  its  content.  Recently  interna¬ 
tional  committees  have  convened  to  promote  the  new  video  and  image  compression  standards. 
These  committees  have  outlined  features  in  compression  that  are  desirable  for  new  compression 
methods.  We  will  first  review  the  existing  standards  and  then  look  at  the  proposed  new  directions. 

The  existing  standard  has  been  based  on  the  JPEG  image  compression  format  which  uses 
the  discrete  cosine  transform  as  a  fundamental  technique  of  compression.  The  DCT  is  a  linear 
transform  with  excellent  compression  properties.  This  compression  standard  has  focused  on  high 
compression  with  accurate  reproduction  as  its  main  criteria.  The  MPEGl  and  2  video  standards 
have  continued  this  tradition  using  block  based  DCT  methods  as  their  foundation  with  provisions 
for  high  resolution,  spatial,  temporal,  and  SNR  scalablity  and  a  host  of  other  features  designed  for 
improvement  of  speed  in  encoding  and  decoding  of  video. 

Recently  the  MPEG  4  requirements  have  changed  from  traditional  techniques  to  less  tradi¬ 
tional  non-DCT  based  methods.  With  the  growing  emphasis  on  wireless  communications,  new 
avenues  for  compression  have  been  explored  such  as  wavelet,  vector  quantization,  and  fractal 
techniques.  These  new  techniques  have  promised  higher  compression  ratios  with  less  image  dis¬ 
tortion.  These  new  methods  have  also  focused  on  the  trend  for  interactive  computer  applications 
where  objects  in  scenes  are  defined  much  as  a  human  would  describe  them.  An  example  might  be 
a  car  separated  from  a  building  separated  from  a  human  as  individual  objects  within  a  scene. 
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Continuing  the  trend  of  object  based  imagery  and  video  MPEG-7’s  goals  are  to  estab¬ 
lish  a  new  compression  format  where  semantic  information  such  as  color,  texture,  and  shape  can 
be  easily  extracted  from  image  and  video  in  the  compressed  domain.  Such  information  should  be 
used  for  identification  of  objects  and  their  relative  position  and  direction  if  they  are  in  motion. 

1.2  Some  Basic  Elements  of  Imagery  and  Video 

Imagery  and  video  as  the  human  visual  system  interprets  them  can  be  broken  into  4 
basic  elements,  shape,  color,  texture,  and  motion.  These  elements  must  be  an  integral  part 
of  the  compression  process  in  an  object-based  interpretive  system.  Shape  has  one  of  the 
most  difficult  techniques  of  pattern  recognition  because  finding  an  invariant  measure  of 
shape  is  a  challenging  process.  A  basic  tasks  of  shape  analysis  is  to  find  a  reliable  edge 
outline  of  an  object.  New  techniques  in  wavelet  multiresolution  analysis  are  simplifying 
this  process  and  are  making  content  based  shape  identification  more  practical. 

Texture  is  also  becoming  an  increasingly  used  technique  of  image  segmentation.  Texture 
is  in  increasingly  useful  when  there  is  no  color  information  and  an  absence  of  well  defined  shapes 
in  an  image.  There  are  many  ways  that  have  been  used  to  describe  texture  from  Fourier  desciptors 
to  eigenvalue  principle  component  analysis  but  texture  description  is  very  much  an  open  topic. 
Recently,  however,  fractals  have  allowed  engineers  to  develop  some  fundamental  texture  descrip¬ 
tors.  Combined  with  new  wavelet  techniques  fractals  are  an  increasingly  popular  means  of  tex¬ 
ture  discrimination. 

Another  often  used  method  of  image  analysis  is  image  color  image  segmentation.  Most 
color  imagery  can  be  segmented  into  3  primary  colors  R,  G,  and  B  or  some  related  color  set.  Thus 
by  seeing  the  relative  colors  present  in  a  given  region  of  the  image  a  user  can  identify  objects  that 
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have  the  same  percentage  of  these  three  colors. 

Finally,  motion  is  a  final  issue  when  for  moving  image  segmentation  and  analysis. 

Many  methods  have  been  developed  recently  to  interpret  motion  within  successive  frames 
of  video.  In  order  to  put  this  motion  analysis  into  the  context  of  object  based  video  com¬ 
pression,  we  must  associate  motion  vectors  with  individual  objects.  We  therefore  explore 
a  new  way  of  performing  such  an  operation  with  a  wavelet-based  optical  flow  technique  . 

1.3  Encoder/Decoder  Design 

Speed  for  both  image  encoders  and  decoders  is  essential.  Even  with  improvements  in 
computer  hardware  speed  the  demands  placed  on  image  coders  and  decoders  are  increasing  pro¬ 
portionally.  The  demands  for  red  time  compression  and  decompression  are  placing  stress  on 
coder  design;  thus  whatever  algorithm  is  used  to  code  data  must  be  computationally  simple  and 
easy  to  implement  on  existing  real  time  hardware.  For  decompression,  any  user  of  a  given  decoder 
should  be  able  to  decompress  a  video  sequence  on  a  normal  personal  computer  with  no  special 
decompression  hardware.  Obviously,  the  compression  process  should  also  not  significantly  distort 
the  original  image  or  video  sequence. 

Scalability  refers  to  an  image  format’s  ability  to  adapt  to  different  bandwidth  and  speed 
capabilities  of  different  video  systems.  For  instance,  some  video  users  may  be  connected  directly 
to  the  internet  through  high  bandwidth  fiber  connections  and  do  not  need  to  worry  about  band¬ 
width,  but  are  more  concerned  with  image  quality.  Some  users  however  simply  want  to  connect  to 
a  video  player  or  conferencing  over  their  phone  line  and  are  willing  to  sacrifice  image  quality  for 
low  bit  rate.  Spatial  scalability  refers  to  the  ability  to  select  the  spatial  resolution  of  an  image  in 
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accordance  with  bandwidth  and  speed  requirements.  Progressive  JPEG,  many  wavelet  techniques, 
and  fractal  encoding  share  this  ability  to  present  the  user  with  a  low  resolution  copy  of  an  image 
and  then  progressively  add  detail  to  this  representation  when  the  higher  resolution  information 
becomes  available  over  the  communication  channel. 

Temporal  scalability  is  another  method  of  controlling  data  flow.  Temporal  scalability 
allows  the  user  of  a  video  compression  system  to  adaptively  select  the  number  of  frames  transmit¬ 
ted  within  a  given  time  interval  to  match  speed  and  bandwidth  requirements.  For  instance,  normal 
NTSC  video  runs  at  30  frames  per  second  or  60  interlace  fields  per  second.  Below  30  frames  per 
second  the  human  eye  can  detect  visible  gaps  in  motion  within  a  video  sequence  but  for  some 
high  speed  film  applications  higher  than  30  frames  per  second  is  necessary.  Thus  depending  on 
the  customers  needs,  the  rate  of  the  video  can  be  scaled  accordingly. 

With  the  advent  of  MPEG-4  a  new  scalability  has  emerged  based  on  video  objects  .  Video 
objects  are  used  in  conjunction  with  the  concept  of  motion  compensated  video  where  objects  in  a 
scene  are  transmitted  in  the  first  frame  and  only  their  motion  is  transmitted  in  sucessive  frames 
until  the  object  significantly  changes  shape  or  disappears  from  the  scene.  As  pattern  recognition 
principles  improve  this  technique  will  continue  to  advance. 

1.4  Overview 

To  accommodate  these  analysis  and  compression  needs  we  turn  to  the  concept  of  the  frac¬ 
tal.  The  fractal  has  emerged  to  describe  many  different  natural  phenomenon.  In  the  work  on  turbu¬ 
lence  theory  Mandelbrot  developed  the  concept  of  a  fractal  to  describe  an  object  which  is  self 
similar  at  any  spatial  scale  at  a  which  you  observe  it.The  essential  characteristic  of  fractals  is  that 


the  smallest  structure  of  the  fractal  resembles  the  overall 


Figure  1-1  Fractal  Fern 

structure.  This  evidenced  in  the  fractal  fern  of  Figure  1-1.  This  concept  of  the  fractal  and  fractal 
image  compression  has  been  mathematically  described  through  linear  approximation  theory. 

Bamsely^  *,  Jaquin^^,  Bogdan^  and  others  have  used  the  concept  of  metric  space,  the  iterated 
function  system,  and  attractors  to  describe  the  process  of  fractal  or  self  vector  quantization  com¬ 
pression.  Unfortunately  linear  approximation  theory  does  not  sufficiently  reveal  the  frequency 
structure  of  fractals  or  fractal  compression. 
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Recently,  others  such  as  Davis^®,  Rinaldo  and  Calvagano^^  have  shown  that  there  is  a 
direct  analog  of  fractal  compression  frequency  segmentation  in  wavelet  compression.  This  real¬ 
ization  is  extremely  valuable  since  wavelet  signal  processing  and  compression  techniques  are  a 
well  understood  signal  processing  method.  In  parallel  other  researchers  such  as  Ameodo  and 

Bacry^  as  well  as  Mallat"^^  have  demonstrated  in  a  1 -dimensional  sense  how  wavelets  can  be 
used  to  reveal  the  underlying  structure  of  the  fractal  and  guide  a  linear  approximation  approach 
for  synthesis  of  a  wavelet  decomposed  signal.  With  this  new  insight  into  the  nature  of  fractals  and 
fractal  compression  we  can  develop  a  new  method  of  fractal  image  compression  combining  both 
fractal  and  wavelet  methods. 

This  new  fractal- wavelet  method  has  the  ability  to  break  an  image  apart  into  the  basic  ele¬ 
ments  of  shape  and  texture.  A  basic  way  to  define  shape  of  an  object  is  in  terms  of  its  geometrical 

outline.  Recently  Mallat^  has  developed  a  wavelet  technique  which  parallels  an  earlier  method 

developed  by  Canny  to  reveal  consistent  outlines  or  shape  of  objects  in  two  dimensions.  On 
the  other  hand,  fractal  dimension  has  been  used  to  describe  the  two  dimensional  textural  regions 

in  imagery^'*.  Recently  Flandrin^^  the  wavelet  model  to  reveal  an  objects  fractal  dimension  by  the 
decay  of  energy  across  frequency  bands. 

What  we  find  through  the  combination  of  the  linear  approximation  and  wavelet  definitions 
of  shape  and  texture,  is  that  there  is  a  direct  connection  between  two.  Logically  this  connection 
makes  sense  in  light  of  the  concept  of  a  firactal  since,  in  a  fractal,  the  smallest  part  resembles  the 
overall  shape  of  an  object.  The  fractal  process  thus  defines  a  mapping  between  the  overall  object 
and  its  smallest  components.  Thus  the  shape  of  the  object  at  a  smaller  scale  is  the  object’s  texture. 
With  this  fact  in  mind  we  can  develop  a  new  concept  for  image  compression  that  combines  shape 
and  texture  models  not  only  for  analysis  but  as  an  essential  part  of  the  compression  process.  Thus 
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compressed  information  can  be  analyzed  without  decompressing  because  it  is  in  a  form  geared 
toward  analysis.  We  will  then  extend  this  principle  to  color  imagery  and  video. 
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Chapter  2 

Mathematical  Basis  of  Fractal  Image  Coding 
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2.1  Metric  Spaces 

To  develop  the  basic  technique  of  fractal  compression  we  must  find  a  mechanism  by 
which  an  image  can  be  decomposed  and  reassembled  or  mathematical  terms  analyzed  and  synthe¬ 
sized.  Do  show  how  this  mechanism  works  we  must  first  provide  an  overview  of  the  concept  of  a 
space.  The  concept  of  space  defines  a  set  of  rules  for  mapping  a  region  of  an  image  onto  itself  in 
a  reproducible  way.  This  mapping  defines  the  essential  structure  of  the  fractal  or  the  fractally 
compressed  object. 

A  space  is  a  set  between  whose  elements  certain  relations  are  prescribed  by  means  of  axi¬ 
oms;  the  set  is  said  to  have  been  given  the  structure  of  the  relevant  space.  ^  At  the  very  specialized 
space  level  we  have  Hilbert  space  (H-space)  which  is  a  special  case  of  Banach  space  (B-space), 
which  is  a  linear  space  and  complete  metric  space,  which  is  an  important  subset  of  more  general 
class  of  topological  spaces.  Figure  2-1  shows  the  relationships  between  spaces.  A  H-space  is  an 

immediate  generalization  of  while  a  topological  space  is  at  a  higher  level  the  hierarchy  and 

can  be  seen  as  a  generalization  of  the  concept  of  an  open  set  in  R^.  A  (linear)  vector  space  V  is  a 
nonempty  set  of  elements  called  vectors  for  which  we  define  two  algebraic  operations:  addition  of 
vectors  and  multiplication  of  a  vector  by  a  scalar  (number).  A  finite  set  of  elements 
{/^.  €  V:  i  =  1, . ,n}  is  independent  if  a  set  of  scalars  {A.^.}  such  that  X,j/ ^  + . ® 

cannot  be  found.  A  finite  set  S  c  V  is  a  basis  in  V  if  it  is  linearly  independent  and  any  vector  V 
is  a  linear  combination  of  elements  in  S. 
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Definition  2.0  A  set  X  is  called  a  metric  space  if  for  each  pair  of  elements  f,  g  €  X  there  is 
associated  a  non-negative  real  number  p(f,g),  the  distance  between  f  and  g  subject  to  the  follow¬ 
ing  conditions: 


1.  Pif,  8)^0-,  pif,g)  =  0  iff  f  =  g 

2.  Pif,g)  =  Pig,f) 

3.  p(/,  g)  <  p(/,  h)  +  p(/i,  g)  for  any  h  e  X  (the  triangle  inequality)Y 

2.1.1  Banach  Space 

We  add  analytic  properties  to  a  vector  ^  space  V  when  we  define  a  norm  (generalized 
absolute  value).  A  norm|H|  :V  — >  9?^  associates  with  every  element  in  V  a  nonnegative  real  num¬ 
ber.  A  normed  space  (a  vector  space  having  a  fixed  norm)  is  metrizable  if  we  set: 

P(/.*)  =  ll/-*li  (2I-1) 

for  all  (/,  g  e  V)y.  We  note  that  only  the  distance  functions  for  which  p(f,g)  =  p{f-g,0)  will  intro¬ 
duce  a  norm.  Very  important  in  applications  are  the  complete  normed  spaces  which  are  called 
Banach  spaces  (B-Space).  In  a  B-space  the  notions  of  completeness  and  convergence  take  the 
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form  of  convergence  in  norm.  A  sequence  {/„}  in  B-space  converges  iff.  there  exists  an  element 


/  G  V  such  that 


lim 


Vn-n 


0 


(2.1.2) 


2.1.2  Hilbert  Space 

We  can  add  geometric  properties  such  as  orthogonality  between  two  elements  in  linear  space 
through  a  scalar  product  (inner  product). 

Definition  2.1  A  scalar  product  (inner  product)  (.1.)  on  a  linear  space  V  over  a  filed  of  scalars 
K  is  a  function  (f,g):  V  x  V  — >  K  such  that  the  following  conditions  are  satisfied  for  all 
(/,  g,he\)  and  all  a,  P  e  k 

1.  ■  if,a9+^h)  =  aif\g)  =  Hf\h 

2.  if\9)  =  i9\f) 

3.  (/|/)>0#/^C 


A  pre-Hilbert  space  is  a  vector  space  V  with  scalar  product.  This  space  becomes  a  normed  space  if 
we  define  a  norm  on  V  through. 
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1 

ll/ll  ^=Uf\f)'^ 


If  this  space  is  a  B-space  then  we  call  V  a  Hilbert  space  (H-space). 


Topological 

Space 


Metric  Space 


Complete 
Metric  Space 


I - - 1  I - 1 

^Banach  Space  ^  ^  Vector  Space  ^ 


Hilbert  Space 


Figure  2-1  Diagram  of  Spaces 


2.2  Fixed  points 

Fixed  points  are  now  the  beginnings  of  a  method  by  which  we  can  transform  a  space  into  a 
desired  function  or  the  basic  principle  in  the  mechanism  of  constructing  a  fractal.  This  approach 
begins  with  the  method  of  successive  approximations.  We  want  to  apply  the  sucessive  approxima¬ 
tion  method  in  the  form  =  r ^  2  First  we  introduce  contractive  operators  on  a  metric 

space  and  then  we  present  the  main  result 

Definition  2.3  An  operator  TiilcQ— on  a  metric  space  (X,p)  is  called  s  contractive  iff  for 
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0  <  ^  <  I  p{Tx,  Ty)  <  5p(A:,  _y)  for  allx,  y  e  Q,.  Ifs  =  1,T  is  called  nonexpansive;  if  the  above 
equation  holds  for  0  <s<°°  ,T  is  called  Lipschitz  continuous.  If  p{Tx,  Ty)  <  p  (x,  y)  for  all 
x,yeQ.  with  x^y  then  T is  called  contractive. 

Theorem  2.0  (Banach  1922)  Given  an  s-contractive  operator  TiQ  c  Q  -»  Q  where  is  closed 
and  non-empty  in  the  metric  space  (X,p),  the  following  are  true: 

1.  Existence  and  uniqueness:  T  has  exactly  one  fixed-point  on  Q  or  equivalently  or  equiv¬ 
alently  equation  4  has  a  unique  solution. 

2.  Convergence  of  iteration:  the  sequence  {Xj,}  of  sucessive  approximations  convergence 
to  the  solution  x  for  any  choice  of  the  initial  approximation  xq. 

3.  Error  estimates:  for  all  n  =  0,1,2,...  we  have  the  a  priori  error  estimate 

n 

(2.2.1) 

and  the  a  posteriori  error  estimate 

4.  Rate  of  convergence:  for  all  n  =  0,1,2,...  we  have 


(2.2.3) 
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To  apply  the  fixed  point  principle  and  the  contraction  mapping  theorem  (2.0),  we  need  a 
slightly  different  setting.  Let  X  be  a  complete  metric  space  and  H(X)  be  the  set  of  nonempty  com¬ 
pact  subsets  of  X.  The  distance  between  a  point  x  and  a  set  A  c:  X  is 

p(x,  A)  =  /«/{p(x,  a):  a  6  A}  (2.2.4) 

a 

The  Hausdorf  distance  is  defined  for  A,B  6  H(X)  by 

p(A,B)  =  sup{p(a,B),  p(b,A):  a  6  A,  e  B}  it  can  be  shown  that  (H(X),  pj,)  is  a  complete 
metric  space  and  we  can  use  Banach’s  fixed  point  fixed  point  theorem  (2.0).  Let’s  define  W  = 
(Wn, . jWjsj}  a  finite  collection  of  contraction  mappings  in  X  and  let  W(K)  =  u  w^.(K) ,  K  e  X 

It  follows  that  W  fs  a  contraction  map  on  H(X)  and  we  can  use  the  contraction  mapping  principle. 
This  result  can  be  reformulated  on  the  metric  space  of  X. 

Theorem  2.0(Hutchiiison).  Let  (X,p)  be  a  complete  metric  space  and  W  ={W], . .w^}  be  a 

finite  set  of  contraction  maps  on  X.  Then  there  exists  a  unique  closed  bounded  set  K  such 
N 

thatK  =  u  h'  .(K)  .  Furthermore,  K  is  compact  and  is  the  closure  of  the  set  of  fixed  points 

i=l  ‘ 

k.  ■  of  the  finite  compositions  w.  o . °w.  .  Starting  with  an  arbitrary  set  A  c  X  the 

*1’ . ’V  1  P 

iterative  methodw”(A)  K  converges  in  the  Hausdorff  metric. 

2.2.1  Iterated  Function  Systems 

Iterated  function  systems  is  the  discrete  or  finite  mathematical  mechanism  for  the'^’^khe 
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synthesis  of  a  fractal.  There  two  types  of  IPS,  global  and  local.  We  will  first  discuss  the  more  gen¬ 
eral  global  IPS  and  then  move  on  to  local  IPS. 

The  collection  W  =  {wj, . .w^}  is  know  as  an  iterated  function  system  (IPS).  In  binary 

2 

and  some  new  forms  of  greyscale  image  coding  we  take  X  =  A  <z  91  a  rectangle  and  let  K  =  I  be 
the  image  pattern.  Usually  the  attractor  K  has  fractal  properties  like  detail  at  every  scale.  It  turns 
out  that  images  which  look  very  complicated  can  be  represented  with  a  very  small  number  of 
transformations. 

One  of  the  most  important  questions  is  if  we  can  build  a  fractal  code  that  approximates 
well  enough  a  given  image  or  signal.  One  approach  is  by  taking  a  fractal  code  T  by  construction, 

starting  with  the  assumption  the  original  image/is  a  fixed-point  /  =  T{f)  =  ijif) 

Because  we  limit  the  possible  maps  T  and  we  do  not  know  the  fixed  point  T  beforehand,  we  are 
looking  for  some  map  such  that  p(/,  Tf)  <  £  .A  bound  to  the  distance  of  the  fixed-point  to  the 
original  can  be  estimated  if  T  is  the  contraction.  As  a  consequence  of  the  contraction  mapping 
principle  2.0  we  have  the  collage  theorem  that  can  be  reformulated  in  the  metric  space  (/,  p) . 
Theorem  2.1  In  the  metric  space  (/,  p)  of  images  with  the  same  support  let  the  fractal  code 
T  €  r  c  B(I,  I)  be  a  contraction  with  Sr  <  1 .  Then  we  have  the  following  upper  bound  of  the  dis¬ 
tance  between  the  fixed-point  f*  =  T(/*)  and  the  original  image  f  ^  I  - 

i.  Srj^ 


(2.2.5) 
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2.2.1.1  The  Global  IFS 

Recall  that  an  iterated  function  system  (IFS)  is  a  collection  of  transformations 
{wj, . ,Wn}  from  a  metric  space  (X,p)  into  itself.  If  W{K)  =  w,(K),  K  €  X  is  a  contraction 

i 

on  the  metric  space  (X,p),  then  iterating  W  starting  with  any  element  in  the  space  converges  to  the 
unique  limit  I.  Such  an  IFS  is  known  as  a  global  IFS  since  it  operates  on  the  entire  metric  space. 

An  IFS  can  encode  black  and  white  images  when  I  is  a  subset  of  R^. 

A  random  iterated  function  system,  also  called  an  (EFSp)  is  an  IFS  together  with  a 

probability  vector  {pj . ,Pi,}.  A  contractive  EFSp  determined  by  {X,Wi,Pn}"j  will  converge  to  a 

unique  set  I  e  X  which  is  the  support  of  the  associated  invariant  measure.  Invariant  measures 
generated  by  dynamical  systems  are  studied  in  ergodic  theory  and  will  be  discussed  in  Chapter  4 
in  detail.  A  gray  scale  image  can  be  modeled  as  the  natural  measure  of  an  EFSp  and  can  be  gener¬ 
ated  using  the  chaos  game.  Consider  the  attractor  I  c  A ,  where  A  is  a  bounded  subset  of  usu¬ 
ally  a  rectangle.  Then  I  is  an  element  of  H(X)  together  with  the  Hausdorf  distance  p(.,.)  is  a 
complete  metric  space  under  certain  conditions.  To  generate  the  attractor  I  we  start  with  any  point 
Zq  e  I  then  all  the  points  Zj  will  be  the  attractor,  else  after  a  number  of  iterations,  Zj  will  be  the 
attractor.  In  practical  implementations,  if  we  do  not  know  the  attractor,  we  can  discard  the  first 
iteration.  The  gray  image  appears  as  the  natural  measure  p  generated  by  a  dynamical  system  we 
have  described.  Let  M(\)  be  the  space  of  all  Borel  probability  measures  on  A  and  let  B  be  a  Borel 


subset  on  A.  Then 
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|X(B)  =  -  (2.2.6) 

/=  1 

where  lg(z,)  is  the  indicator  function  for  the  set  B.  To  visualize  the  measure  on  a  computer  dis¬ 
play,  we  have  to  compute  |J.(B,)  for  each  pixel  and  rescale  the  dynamic  values  to  match  the 

dynamic  range  of  that  display.  Now  we  can  rewrite  the  fixed-point  equation  directly  in  the  space 
of  probability  measures. 

Under  some  technical  conditions,  M(A)  endowed  with  the  Hutchinson  metric  is  a 
complete  metric  space  and  we  can  define  the  Markov  operator  which  is  the  contraction: 

n 

W(B)  =  ^  (2.2.7) 

i=  1 

The  contractivity  of  r(p,)  was  proved  for  the  hyperbolic  case  when  all  {Wj}  are  contractions.  We 

have  seen  that  {  w,  }  j”  uniquely  determines  the  attractor  I  of  the  support  of  the  IFSp.  For  different 
sets  of  probabilities  {pj}  we  obtain  different  gray  level  images,  all  having  the  same  support.  Sev¬ 
eral  IFS’s  can  be  combined  to  obtain  a  complicated  image.  Next  we  look  at  a  special  type  of  IFSp 
where  the  transformations  {w,  }  are  restricted  to  a  particular  region  of  the  space  A. 

2.2.1.2  The  Local  IFS 

The  local  iterated  function  system  was  named  fractal  block  coding  by  Jacquin  ,  local 
interated  function  system  (LIFS)  and  the  fractal  transform  by  Bamsely^^  or  partitioned  iterated 
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function  system  (PIFS)  by  Fischer,  Jacobs,  and  Boss^^.  It  is  the  basis  of  what  is  currently  known 

2 

as  fractal  image  compression.  The  support  of  the  gray  scale  image  is  a  bounded  rectangle  I  e  R 

which  is  invariant  under  a  block-wise  continuous  transformation  W:R  R  .  In  a  PIFS  we 
obtain  more  flexibility  by  having  the  domain  of  each  wj  restricted  to  a  subset  D  •  c  I .  Note  we  can 
find  an  infinite  number  of  combinations  {wj}  with  the  same  invariant  set  I.  Also  W  may  have  sev¬ 
eral  invariant  sets. 

A  grey  level  image  is  modeled  as  the  invariant  probability  measure  |X  generated  by  the 
iteration  of  W  on  the  complete  metric  space  H(A)  and  the  associated  mapping  T  given  by  {w^pj} 

operating  on  the  space  of  probability  measures  M  with  the  support  A  as  before  T :  Af  (A)  — >  Af  ( A) 
Since  in  PIFS  coding,  the  invariant  set  of  the  IFS  is  the  space  A,  we  have  I  =  A  =  supp{\i) , 

Contractivity  of  each  Tj  is  given  by  p,  <  1  and  guarantees  convergence  of  the  iteration  of  T 

to  limit  p,*  in  the  space  of  probability  measures.  The  limit  p*  is  the  approximation  of  the  image 
p  to  be  coded,  and  T  is  called  the  fractal  code  for  p  .  In  a  practical  coding  system,  I  is  parti¬ 
tioned  in  rectangular  blocks  by.  Block  mapping  Wjj  and  a  massic  (greytone)  transformation  with  a 
gain  factor  py  for  each  block  such  that 

r(n)  =  = 

Coding  the  image  p  is  equivalent  to  finding  all  mappings  Ty,  the  restriction  of  T  to  each 
block  of  the  partition.  For  decoding  we  use  the  chaos  game  as  before,  or  we  can  apply  the  Markov 
operator  T  in  a  successive  approximation  algorithm  (start  with  any  initial  image  v  e  M ) 

p*  =  (r)"(v),n^oo 


(2.2.9) 
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The  problem  of  finding  T  =  ^  T-j  makes  ITT  coding  very  complex  and  thus  we  seek  an  new 
means  of  determining  the  transformation  T. 

2.3  Fractal  Compression  Techniques 

Extending  the  concept  of  the  IPS  to  practical  application  there  are  two  primary  methods 
that  have  emerged.  Jaquin’s  Algorithm^^  was  based  on  self  vector  quantization  techniques  where 

the  codebook  is  taken  from  the  image  itself.  Yuval  Fisher  ^^extended  this  technique  for  a  variable 
partitioning  of  the  image  based  of  feature  size  and  resolution. 

2.3.1  Jaquin’s  Algorithm 

Thus  we  develop  what  is  commonly  known  as  the  fractal  block  transform  shown  below  in  Fig¬ 
ure  4a 


2-2  Jaquin’s  1-D  Block  Mapping 

This  process  thus  exploits  the  self-similarity  of  two  spatial  scales  to  compress  an  image.  The  pro- 
cess  of  fractal  encoding  is  thereby  described  by  equation  2.3.1 
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.  fix)  =  Tfix)  =  U^fix)  +  b  =  Q^fi2x)  +  b 


(2.3.1) 


where  f(x)  is  the  image  to  be  transformed  and  T  is  a  contractive  operator  with  unique  fixed  point 
f.  Encoding  f  means  finding  and  operator  T  having  a  fixed  point  f  approximately  equal  to  f  while 
decoding  is  equivalent  to  finding  the  fixed  point  f  by  iterating  T  starting  with  an  image  selected  at 
random.  ULrepresents  the  transformation  applied  to  the  domain  blocks  which  both  grey  levels  and 
decimates;  Ql  represents  a  simpler  Ul  that  simply  scales  the  already  decimated  domain  blocks, 
and  b  is  the  intensity  offset  applied  to  the  domain  blocks.  Note  that  f,  b  e  V  where  V  is  a  discrete 
and  finite  dimensional  space. 

In  two  dimensions  the  fractal  encoding  processes  can  be  described  in  4  steps.  Geneti¬ 
cally,  fractal  encoding  involves  taking  a  block  of  pixels  in  a  image  known  as  a  domain  block,  spa¬ 
tially  subsampling  the  block,  and  matching  it  to  a  smaller  block  of  pixels  known  as  a  range  block. 
Step  1:  Divide  Image  Into  Range  and  Domain  Blocks 


Domain  Blocks 


Variable 
Position  2x 
Size  of 

Range  Blocks 


Step2:  Match  Domain  Blocks  to  Range  Blocks 
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Get  Best 
Possible  Match 
From  Pool 
of  Domain 
Blocks  for 
Given  Range 
Block 

Domain  Block  Transformation  Range  Block 

Step  3:  Build  a  Table  of  Domain  Block  Positions,  Rotations  &  Scalings  -  This  is  the  Compressed 
Image 

Step  4:  Reverse  Process  to  Rebuild  Image  Start  with  blank  image  with  all  pixels  set  to  some  arbi¬ 
trary  value  and  iterate  on  transformations  from  coded  table. 


Apply  Rotatation,  Decimate, 
Scale  in  Intensity 


Domain  Blocks  Apply  Scale/Rot  Range  Blocks 


Figure  2-2  Jaquin’s  2-D  Block  Mapping 


As  is  shown  in  Figure  2-3  Bogdan^  and  Davis^®  showed  the  fractal  block  mapping  technique  takes 
low  frequency  coefficients  and  maps  them  to  high  frequency  coefficients.  Thus  the  block  map¬ 
ping  uses  the  low  frequency  information  to  approximate  the  high  frequency  information. 
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Jaquins  Fractal  Frequency  Mapping 


Encoding 


Power 


Decoding 


Power 


Figure  2-3  Frequency  Analysis  of  Jaquin’s  Technique 

2.3.2  Fischer’s  Algorithm 

The  problem  with  Jaquin’s  technique  is  that  once  a  range  and  domain  block  size  is  chosen, 
all  features,  large  or  small  must  be  represented  with  those  block  sizes.  To  remedy  this  problem, 

Yuval  Fischer  has  modified  the  original  fractal  transform  by  incorporating  adaptive  block  sizing 
depending  on  the  amount  of  information  within  each  fractal  range  and  domain  block.  This  pro¬ 
cess  gives  better  overall  results  simply  because  it  preserves  spatial  detail  in  those  blocks  where 
information  is  needed  and  achieves  high  compression  for  those  regions  which  are  relatively 
homogenous.  Fischer  uses  different  tiling  techniques  such  as  quadtree  partitioning  where  blocks 
remain  square  as  in  Jaquin’s  algorithm  and  rectangular  quadtree  which  gives  greater  adaptability 
to  spatial  features.  Also,  Fischer  as  made  use  of  triangular  range  and  domain  blocks. 
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Fischer’s  Fractal  Frequency  Mapping 


Encoding 


Power 


mapping 


Decoding 


mapping 


Power 


2-4  Frequency  Analysis  of  Fischer’s  Technique 
As  is  shown  in  the  Figure  2-4,  like  Jaquin’s  technique,  Fischer  uses  the  low  frequency 
information  to  reconstruct  the  high  frequency  information.  Unlike  Jaquin  however,  Fischer 

reconstructs  with  multiple  frequency  scales  from  low  to  high  frequency  .  The  width  of  these  fre 
quency  scales  for  the  quadtree  method  is  powers  of  two.  Looking  at  the  fractal  compression  pro 
cess,  we  will  shown  in  Chapter  3  a  similarity  to  the  wavelet  transform  frequency  breakdown. 
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Chapter  3 

Wavelet  Image  Coding  and  Parallels  to  Fractal  Image  Coding 
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3. 1  Wavelets 

As  stated  at  the  end  of  Chapter  2,  researchers  such  as  Davis^*^  have  seen  a  direct 
similarity  between  fractal  and  wavelet  compression.  Thus,  we  zu:e  motivated  to  study  the  wavelet 
compression  methods  from  both  a  1  and  2  dimensional  perspective  to  gain  a  better  understanding 
of  fractal  image  compression  and  develop  a  means  of  improving  compression  performance.  First 
we  will  describe  the  general  properties  of  wavelets.  We  will  then  introduce  two  types  of  wavelets, 
the  QMF  Haar  and  the  Gaussian  Derivative  Spline.  We  will  show  how  the  frequency  structure  of 
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the  Haar  wavelet  mirrors  that  of  Jaquin  and  Fischer’s  fractal  compression  and  then  show  how  the 
gaussian  derivative  Spline  has  desirable  properties  which  will  be  used  to  improve  on  the  fractal 
compression  process.  Finally  we  will  give  two  examples  of  conventional  wavelet  image  coding 
techniques  for  later  comparison  with  the  new  fractal  wavelet  method. 

3.1.1  Definition 

Practically  speaking,  a  wavelet  is  a  means  of  approximating  a  function  with  a  set 
of  linear  vectors.  This  set  of  vectors  forms  a  basis  set.  This  basis  is  orthogonal  if  the  scalar  prod¬ 
uct  of  the  vectors  is  equal  to  zero.  Vectors  are  formed  from  dilating  and  translating  of  some 

mother  wavelet  vector  \|/(x)  which  is  based  on  some  scaling  function  <l)(x)  .  The  wavelet 
decomposition  thus  forms  a  linear  transform.  This  linear  transform  can  be  formulated  in  terms  of 
linear  filters  \|r(x)  typically  corresponds  to  a  highpass  filter  g(k)  (t)(x)  typically  corresponds  to 
a  lowpass  filter  h(k). 

The  only  constraint  imposed  on  a  wavelet  function  \\f{x)  real  or  complex  valued,  in 
order  to  be  a  wavelet  is  the  admissibility  condition  that  requires:  i  f  \|t(x)  is  integrable  this  actu¬ 
ally  implies  that 

J  \|r(x)t&  =  0  or  \ir(|k|  =  0)  =  0  (3.1.1) 

rT 


3.1.1.1  Properties 

The  properties  of  the  wavelet  transform,  admissibility,  similarity,  invertibility,  regularity, 
and  moment  cancellations  are  described  as  follows: 

Admissibility:  For  an  integrable  function,  this  means  that  its  average  is  zero. 

Similarity:  The  scale  decomposition  should  be  obtained  by  the  translation  and  dilation  of 


only  one  “mother”  function. 
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Invertibility:  There  should  be  at  least  one  reconstruction  formula  for  recovering  the  signal 
exactly  from  its  wavelet  coefficients  and  for  allowing  the  computation  of  energy  or  other  invari¬ 
ants  directly  from  them. 

Regularity:  In  practice  the  wavelet  should  also  be  well  localized  on  both  sides  of  the  Fou¬ 
rier  transform,  namely  it  should  be  concentrated  on  some  finite  spatial  domain  and  be  sufficiently 
regular. 

Cancellations  For  some  applications,  in  particular,  turbulent  signal  analysis,  the  wavelet 
should  not  only  be  of  zero  value(admissiblity).  but  should  also  have  some  vanishing  high-order 
moments. 

3.1.1.2  Analysis 

The  analysis  of  a  signal  involves  decomposing  with  a  set  of  functions  which  are  derived 
from  some  common  “mother”  function.  From  the  function  \|r  the  so  called  mother  wavelet, 
^^we  generate  the  family  of  continuously  translated,  dilated  and  rotated  wavelets: 

_n 

V/;('0(x)  =  1  (3- 1-2) 

with  /  e  as  the  scale  dilation  parameter  corresponding  to  the  width  of  the  wavelet  and 

x'  e  R"  as  the  translation  parameter  corresponding  to  the  position  of  the  wavelet;  1  and  x’  are 
dimensionless  variables.  1  denotes  the  wavelet  scale  because  it  corresponds  to  the  length  scale  at 
which  analyze  f(x)  and  the  position  of  the  analyzing  wavelet  x’,  because  it  indeed  corresponds  to 
the  actual  position  in  physical  space;  we  must  also  distinguish  x’  and  x  which  will  be  used  an  inte¬ 
gration  variable.  The  rotation  matrix  Q  belongs  to  of  rotations  in  R”  and  depends  on  the  n(n-l)  / 
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2 

2  Euler  angles© .  The  factor  1  is  a  normalization  which  causes  all  the  wavelets  to  have  the  same 

L  norm,  therefore  all  wavelets  will  have  the  same  energy  and  the  wavelet  coefficients  will  corre¬ 
spond  to  energy  densities. 

3.1.1.3  Synthesis 

Reconstruction  of  the  original  signal  is  known  as  synthesis.  The  admissibility  condition 
implies  the  existence  of  a  reproducing  kernel.  We  can  therefore  recover  the  signal  f(x)  from  its 
wavelet  coefficients  as  shown  in  equation  3.1.5. 

oo 

/(X)  =  c;T  J  (3-1-3) 

3.1.1.4  Comparison  With  The  Fourier  lyansform 

The  primary  advantage  of  the  wavelet  transform  is  its  ability  to  localize  in  time  or  frequency 
where  the  fourier  transform  has  great  localization  in  frequency  but  poor  localization  in  time. 
Given  an  absolutely  integrable  function  its  Fourier  Transform  is  defined  by 

oo 

F(w)  =  (3.1.4) 

— oo 

Likewise  the  discrete-time  Fourier  Transform  is  Defined  as 

F[k]  =  X  (3.1.5) 

«  =  0 

The  time  and  frequency  localization  properties  of  the  wavelet  transform  can  be  shown  in  the  fol¬ 
lowing  Figure  3-1  using  a  space  frequency  analysis  plot 
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Figure  3-1  Space  vs.  Frequency  Plots  of  Fourier  vs.  Wavelet  Transform 
And  finally  the  short  time  Fourier  Transform  is  defined  by 


STFTf(Gi,t)  =  J  (o*(r-x)/(r)e 


(3.1.6) 


Which  like  the  wavelet  transform  is  localized  in  frequency  because  it  is  windowed  in  time. 


3.1.2  Bases 

The  expansions  of  a  mother  wavelet  are  described  as  the  bases  of  the  wavelet.  Such 
expansions  can  be  described  as  orthogonal  or  biorthogonal.  The  elements  of  /,  gf  e  H  where 
His  a  H-space,  are  said  to  be  orthogonal  if  (fig)  =  0.  A  system  {f^gj}  constitutes  a  par  of  biorthog¬ 
onal  bases  of  a  Hilbert  space  H  if  and  only  if 

(a)  For  all  i,j  in  Z  (integers) 

(filfj)  =  5[i-j]  (3.1.7) 

(b)  There  exists  strictly  positive  constants  Aq,  Aj,  Bq,  Bj  such  that  for  all  y  in  H 
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(3.1.8a) 

k 

J]|(gj^,y)|  <Bj|y||2  (3.1.8b) 

k 

Let  n  be  a  closed  subset  in  a  complete  metric  space  X.  A  fixed  point  of  a  given  mapping 
T:Q  ^  O ,  not  necessarily  linear  is  every  solution  x  e  Q  of  the  equation  A  x  =  Tx 

3.1.3  Moments 

At  this  point  it  is  important  to  note  that  the  number  of  vanishing  moments  of  the  wavelet  is  n 
where 

oo 

0  =  j  (3.1.9) 

— oo 

Information  about  the  number  of  moments  in  a  wavelet  function  can  be  critical  to  determining 
certain  frequency  properties  of  the  wavelet  transform  such  as  Alpha  and  Holder  exponents.  This 
fact  will  be  discussed  in  detail  later  in  Chapter  4. 

3.2  Orthogonal  Structures:  QMF  Haar 

To  begin  with  a  specific  wavelet  basis  we  describe  the  ^^Haar  Wavelet.  The  Haar  Wavelet 
is  one  of  the  most  basic  wavelet  basis  and  as  we  will  show  later  can  be  used  to  explain  the  basics 
of  fractal  compression  since  the  averaging  of  a  domain  block  to  obtain  a  range  block  is  simply 

applying  the  Haar  lowpass  filter  to  the  image^®. 

We  begin  as  follows  \j;(x).  It  is  a  step  function  taking  values  1  and  -1  on  [0,0.5)  and  [.5,1) 


respectively  as  follows: 
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\\fjk{x)  =  const  ■  \\l{2-'x-k) 

define  a  an  orthogonal  basis  in  L^(R),  the  space  of  square  integrable  functions.  This  means  that 
any  element  in  L^(R)  may  be  represented  as  a  linear  combination  (possibly  infinite  of  these  basis 
functions 


1.0  — 
0.9  - 
0.8- 
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Figure  3-2  Haar  Wavelet 

A  general  approach  to  using  the  Haar  Wavelet  to  decompose  a  signal  is  as  follows  devel¬ 
oped  by  Stephan  Mallat.  We  start  with  the  space  L?  of  square  integrable  functions.  The  MRA  is 
an  increasing  sequence  of  close  subspace  {Vj}7e  z  which  approximates  L  (R). 

3.2.1  Basis  Derivation 

Everything  starts  with  a  clever  choice  of  the  scaling  function  (j) .  Except  for  the  Haar 
wavelet  basis  for  which  (])  is  the  characteristic  function  of  the  interval  [0,1),  the  scaling  function  is 
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chosen  to  satisfy  some  continuity,  smoothness  and  tail  requirements.  But,  most  importantly,  the 
family  {(^{x  -k),ke  Z}  forms  an  orthonormal  basis  for  the  reference  space  Vq.  The  following 
relation  describes  the  analysis: 

-cy.iCVocVi---  (3-2.1) 

The  spaces  Vj  are  nested.  The  space  L^(R)  is  a  closure  of  the  union  of  all  Vj.  In  other  words, 

U  y  g  2^ j  is  dense  in  L^(R) .  The  intersection  of  all  Vj  is  empty. 

fix)  e  Vj  ^  filx)  eVj^.JeZ  (3.2.2) 

The  spaces  Vj  and  Vj+j  are  “similar”.  If  the  space  Vj  is  spanned  by  <|)yjt(jc),  ke  Z  then  the 
space  Vj+j  is  spanned  by  (|)y  ^  i  /^{x),  ke  Z.  The  space  Vj+j  is  generated  by  the  functions 

We  now  explain  how  the  wavelets  enter  the  picture.  Because  Vq  c  Vj ,  any  function  in  Vq 
can  be  written  as  a  linear  combination  of  the  basis  functions  J2^(2x  -  k)  from  V  j  in  particular: 

<t)(x)  =  ^  ^hik)j2(^(2x-k)  (3.2.3) 

Coefficients  h(k)  are  defined  as  {(|)(x),  J2^i2x  -  k)) .  Consider  now  the  orthogonal  complement 
Wj  of  Vj  to  Vj+i  (i.e.  Vj+i  =  Vj  0  W j).  Define 

y^fix)  =  72^  ^i-l)''hi-k+mi2x-k)  (3.2.4) 

It  can  be  shown  that  {  J2^  iV(2x  -k),ke  Z}  is  an  orthonormal  basis  for  Wj.  Again  the  sim¬ 
ilarity  property  of  MRA  gives  that{2'^^^\|/(2A:  -k),ke  Z}  is  a  basis  for  Wj.  Since 
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u  jez^j  ~  jsZ^j  is  dense  in  L^(R),  the  family  \|/y^(x)  =  2'^^^V|/(2jc-/:),  e  Z  is  a 
basis  for  L^(R). 

2 

For  a  given  function  /  €  L  {R)  one  can  find  N  such  that  f  e 

some  up  to  preassigned  precision  (in  terms  of  L2  closeness).  If  g-  e 

M 

f  N  -  f  N-i"^  Sn-i'^  ^  S  '^In-m 

,=  i  n-m 

The  above  equation  is  the  wavelet  decomposition  of/. 

3.2.2.  QMF  structure 

The  filter  bank  is  the  means  by  which  a  wavelet  is  implemented  to  decompose  a  signal  or 
image  into  different  frequency  bands.  There  are  many  filter  bank  implementations.  The  quadra¬ 
ture  mirror  filter  (QMF)  is  here  is  presented  as  an  example  as  one  of  the  most  basic  filter  bank 

•  •  79 

implementations  and  is  routinely  used  for  image  decomposition 

Repeating  the  above  description  in^^  ’^^terms  of  signal  processing  we  recall  that 

=  X  6  z^(*)'n/2<|)(2x  -  k)  (3.2.6) 

and 

¥(^)  =  X  (3.2.1) 

The  1^  sequences  {h(k),  ke  Z}  and  {g(k),  fe  e  Z}  are  quadrature  mirror  filters  in  the  terminol¬ 
ogy  of  signal  analysis.  The  connection  between  hand  g  is  given  by: 


approximates/  up  to 
and  /,•  e  F,-  ,  then 


g(n)  =  (_l)"/i(l-n) 


(3.2.8) 
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The  sequence  h(k)  is  the  low  pass  filter  while  g(k)  is  the  high  pass  filter.  The  following  properties 
of  h(n),  g(n)  can  be  proven  by  using  Fourier  transforms  and  orthogonality: 

^h{k)  =  Jl,  '^gik)  =  0  As  a  result  this  filter  structure  k  is  shown  in  Figure  3-3 


Figure  3-3  Biorthogonal  Filter  Bank  Structure 
The  most  compact  way  to  describe  Mallat’s  multi-resolution  analysis  MRAand  determine 
the  wavelet  coefficients  is  the  operator  representation  filters.  The  mallat  multiresolution  frequency 
structure  is  shown  graphically  below  in  Figure  3-4 


Figure  3-4  Mallat  Wavelet  Frequency  Structure 

It  is  useful  to  note  that  this  frequency  structure  of  the  MRA  approach  is  identical  to  the  Fischer’s 
fractal  frequency  decomposition  shown  if  Figure  2-4.  This  makes  sense  since  at  each  stage  of 
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domain  to  range  mapping  on  a  quad-tree  partition  essentially  applies  a  Haar  H  filter  to  the  domain 

image  and  then  maps  this  information  to  the  next  higher  frequency  band  of  range  blocks. 

For  a  sequence  a  =  {an}  the  operators  H  and  G  are  defined  by  the  following  coordinate- 
wise  relations: 

(Ha)t  =  '2„h(,n-2k)a„  (3.2.9a) 

(G»),  =  £„«(«- 2<:K  (3-2-9b) 

The  operators  H  and  G  correspond  to  one  step  in  the  wavelet  decomposition.  The  only  dif¬ 
ference  is  that  the  above  definitions  do  not  include  the  Jl  factor. 

Denote  the  original  signal  by  c^"\  If  the  signal  is  of  length  2",  then  c^"^  can  be  represented 
by  the  function /(x)  =  /s  .  At  each  stage  of  the  wavelet  transformation  we 

move  to  a  coarser  approximation  c^'^^  by  =  Hc^^  and  d^"*^  =  Gc^\  Here,  d^‘^^  is  the  detail 
lost  by  approximating  is  the  detail  lost  by  approximating  c^'*^ .  The  discrete  wavelet  transforma¬ 
tion  of  a  sequence  of  length  2"  (notice  that  the  sequence  c^'*^  has  half  the  length  of  c^^): 

(/" " ' \  ~ . (3.2. 10) 

Thus  the  discrete  wavelet  transformation  can  be  summarized  as  a  single  line: 

y  ^  {Gy,  GHy,  GH\ . GH''~\,  GH'^y)  (3.2.11) 

The  reconstruction  formula  is  also  simple  in  terms  of  H  and  G;  we  first  define  a  joint  oper- 

He 

ators  H  and  G  as  follows: 

{Ha)n  =  Y.Ji{n-lk) 


(3.2.12a) 
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(G* a)n  =  X  khin-2k) 


(3.2.12b) 


Recursive  application  leads  to: 


n  -  1 


(Gy,  G//y,  GH  y, . GH”"  y,  GH’'y)^y  =  ^  (3.2.13) 

;  =  0 


The  above  equations  (generating  equations)  used  with  the  Haar  wavelet  are  as  follows: 


(j)(x)  =  (|)(2a:)  +  (1)(2x-1)  = 

V2  42 


(3.2.14) 


\|/(x)  =  <^(2x)  +  <i^i2x-l)  =  ■^fy(2x)-^^42isf{2x-\) 

42  42 


(3.2.15) 


The  filter  coefficients  in  the  above  equations  are: 


/i(0)  =  h{\)  =  4=  ^(0)  =  -g(l)  =  4= 
42  42 


(3.2.16) 


To  get  the  wavelet  coefficients  we  multiply  the  components  of  d^\  j  =  0,1,2  and  with 
the  factor  2'^^^.  Simply: 

=  2~^^^d[^\  0<j<Ni=3)  (3.2. 17) 

It  is  interesting  that  the  Haar  wavelet  case  2'^^^Cq  =  Cqq  =  0.5  is  the  mean  of  the  sample  y. 
The  entire  frequency  bank  decomposition  is  shown  below  in  Figure  3-5 
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Figure  3-5  Multiresolution  QMF  Frequency  Decomposition 
Thus  we  have  a  direct  description  of  how  the  fractal  image  compression  process  divides 
itself  in  terms  of  frequency  representation  by  means  of  the  QMF  Haar  frequency  decom¬ 
position. 

3.2.2  2-D  representation 

The  QMF  two  dimensional  decomposition  is  shown  below.  The  give  image  is 
decomposed  in  each  direction  by  a  high  and  low  pass  filter  and  downsampled  in  those  directions. 
Next  the  wavelet  is  decomposed  in  the  other  direction  in  the  same  manner  for  a  resulting  two 
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h 


Figure  3-6  Two  Dimensional  Multiresolution  QMF 
dimensional  frequency  spectra  as  is  shown  above.  While  this  method  is  the  standard  image 
decomposition  used  in  much  of  wavelet  image  processing  the  Haar  basis  set  itself  has  many  draw¬ 
backs  in  the  localization  of  its  frequency  spectra  and  the  subsampling  process  decreases  the  inher¬ 
ent  resolution  of  the  image  decomposition^®. 

3.3  Existing  Fractal-Wavelet  Analog  Techniques 

Recently  a  number  of  researchers  including  Geoff  Davis  noted  that  the  Haar  wave¬ 
let  showed  a  direct  analog  in  the  fractal  compression  process^®’  As  a  result  the  method 
of  the  Haar  transform  has  recently  been  adapted  to  improve  the  fractal  coding  processs.  In 
addition  another  wavelet  compression  technique  has  emerged  which  like  the  fractal  uses 
cross  scale  redundancy  to  achiev  compression  This  method  Zero  Tree  Method  known 
as  the  zero  tree  was  developed  by  Shapiro  and  uses  a  wavelet  decomposition  similar  to 
the  QMF  Harr  decomposition  described  earlier.  These  techniques  also  give  insight  on  how 
to  design  a  better  compression  method  using  fractal  and  wavelet  principles. 
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3.3.1  Davis  SQS  Technique 

Geoff  Davis^®’  and  others  noted  that  the  wavelet  transform  could  mimic  the 
block  mapping  process  of  the  fractal  transform  and  developed  the  first  hybrid  transform. 
First  Davis  realized  that  the  averaging  process  in  range  to  domain  block  mapping  was  the 
low-pass  wavelet  of  the  Haar  transform.  He  also  noted  that  the  range  to  domain  mapping 
process  at  each  scale  could  be  made  significantly  faster  by  matching  the  high  pass  bands  to 
each  other  to  approximate  the  low  pass  to  high  pass  reconstruction  .  This  frequency  divi¬ 
sion  makes  sense  since  the  high  pass  must  be 
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3-7  Frequency  Analysis  of  Davis’  Technique 

approximated  upon  fractal  reconstruction  anyway,  so  the  frequency  division  process  of  the  wave¬ 
let  transform  simply  organizes  this  process.  Davis’  method  also  shows  outstanding  frequency 
compression  as  well.  Unfortunately,  the  complexity  of  the  compression  scheme  in  this  case  is 
high  since  there  are  multiple  high  pass  bands  to  map  from  high  pass  to  low  pass  on  encoding.  Also 


41 

Davis  does  not  take  into  account  the  implicit  relationships  between  shape  and  texture  In  addition 
this  technique  does  not  preserve  shapes  in  the  compressed  domain. 

3.3.1.1  Band  structure  mapping 

Figure  3-7  shows  how  the  local  IFS  of  the  hybrid  techniques  maps  Domain  to  Range 
Blocks  from  higher  to  lower  scales.  This  process  is  the  same  as  the  multifractal  except  the  map¬ 
ping  changes  from  scale  to  scale  since  the  wavelet  fractal  process  uses  a  local  rather  than  global 
iterated  function  system.  The  reconstruction  process  we  will  first  develop  then  uses  the  attractor 
model  as  in  the  multifractal  case  to  reconstruct.  Thus  we  have  a  linear  synthesis  and  nonlinear 
piecewise  reconstruction. 

As  is  also  shown  in  Figure  3-7  the  reconstructive  mapping  approximates  higher  frequency 
information  with  information  from  lower  scales.  This  process  thus  is  a  predictive  coding  tech¬ 
nique  from  low  to  high  frequency. 

3.3.2  Shapiro  Zero  Tree 

The  Shapiro  Zero  Tree  algorithm  has  been  called  “the  state  of  the  art  in  image  coders”. 
Essentially  zero-tree  uses  the  fact  that  power  across  frequency  scales  tends  to  naturally  decay  as 
one  increases  in  frequency  or  decreases  in  scale.  This  fact  allows  a  person  attempting  image  cod¬ 
ing  to  eliminate  information  in  image  at  a  particular  frequency  scale  and  all  scales  below  it  by 
determining  the  local  energy  in  a  block  in  one  of  the  high  pass  images.  If  the  energy  is  below  a 
certain  threshold  the  coefficients  at  that  scale  and  all  scales  below  it  are  not  coded. 
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3-8  Schapiro’s  Zero  Tree 

3.4  Biorthogonal  Spline 

The  Haar  in  the  applications  above  is  a  useful  structure  because  of  its  simplicity  in 
implementation.  However,  for  content  based  image  compression  we  look  to  a  different  wavelet 
stracture  to  directly  reveal  the  natural  shape  and  texture  composition  of  an  image.  This  structure 
is  the  biorthogonal  spline  wavelet. 

Unlike  the  orthogonal  structure,  the  biorthogonal  structure  reconstructing  wavelet 
is  not  the  complex  conjugate  of  the  analyzing  wavelet  but  a  totally  separate  function.  Such  bior¬ 
thogonal  structures  are  useful  in  maintaining  syihmetry  in  the  wavelet  filters.  Such  symmetry  is 

necessary  later  for  such  applications  as  the  edge  detection  routines  used  in  Mallat’s  decompo¬ 


sition  shown  later. 
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Figure  3-9  Biorthogonal  Filter  Bank  Structure 

3.4.1  Gaussian  Derivative  Basis 

A  particular  implementation  of  the  biorthogonal  spline  uses  the  Gaussian  deriva¬ 
tive  basis  set.  As  we  shall  show  later  the  Gaussian  derivative  has  many  desirable  properties  useful 
in  signal  and  image  analysis  including  an  ability  to  accurately  reveal  boundaries  and  edges  in  sig¬ 
nals  and  imagery  minimal  artifact  production  in  image  reproduction.  These  properties  will  be 
extremely  useful  in  designing  a  new  fractal  wavelet  method  of  image  analysis  and  synthesis  in 
Chapters  4  and  5. 

The  Mallat  transform  \i/(x)  is  constructed  in  a  very  different  manner  from  the 
QMF  Haar.  Start  with  a  gaussian  function  and  take  its  derivative  as  shown  in  the  following  fig¬ 
ures; 


Gaussian  Smoothing  Function  Gaussian  Derivative  Mother  Wavelet 


Figure  3-10  First  Derivative  Gaussian  Smoothing  Function 
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The  result  is  a  an  extremely  effective  means  of  detecting  singularities(edges)  in  a  signal.  If  we 
represent  our  gaussian 


2  >> 
X 


(3.4.1) 


then  our  smoothing  function  at  scale  Sq  is 


:e,  (X) 

^0 


where  we  assume: 


0  =  W 


(3.4.2) 


Our  defining  wavelet  is  thus 


\|r(x)  = 


(3.4.3) 


For  each  scale  we  have 


S 


(3.4.4) 


Our  1-D  wavelet  operator  is  thus: 


Wlf(x)  =f(8)V|rg(x) 


(3.4.5) 


3.4.2  Wavelet  Implementation 
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We  impose  that  the  Fourier  transform 


of  the  smoothing  function  <j)(x)  defined  by 


+  0O 

y=i 


(3.4.6) 


can  be  written  as  an  infinite  product 


$(w) 


-iwco 

e 


+  00 


n  ^(2 

p = 1 


(3.4.7) 


where  //(®)  is  a  2n  periodic  differentiable  function  such  that 


|//(®)|^  +  |H((0  +  JC)1^<1  and|//(0)|  =  1. 


(3.4.8) 


One  can  prove  that  the  conditions  eire  sufficient  so  that  the  above  equations  define  a  smoothing 
function  ^{x) ,  which  is  in  L^(R).  The  parameter  w  is  a  sampling  shift.  It  its  adjusted  in  order 
that  ^(x)  is  symmetrical  with  respect  to  0.  The  above  equations  imply  that 


^  ““iwco  ^ 

(t)(2(0)  =  e  //(co)<|)(©) 


(3.4.9) 


We  define  a  wavelet  \|/(x)  whose  Fourier  transform  \jlr(x)  is  given  by 
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V(2co)  =  e  ''^“g(0)^(o))  (3.4.10) 

where  G(a))  is  a  27t  periodic  function.  Let  us  impose  that  the  Fourier  transform  of  the  reconstmct- 
ing  wavelet  x{x)  can  be  written 

5C(2to)  =  e''^®/i:(©)^(a))  (3.4.11) 

From  the  above  equations  we  can  derive 

G(o))A:((0)  +  |/f(co)|^  =  1  (3.4.12) 

and  for  the  two  dimensional  transform 


L(a))  = 


1  +  |H((0)|^ 
2 


(3.4.13) 


We  want  a  wavelet  \|/(x)  equal  to  the  first  order  derivative  of  a  smoothing  function  9(a:)  . 
This  implies  that  \j/(x)  must  have  a  zero  of  order  1  at  co  =  0 .  We  choose  //(to)  in  order  that  a 
wavelet  \|/(x) ,  which  is  antisymetriced,  as  regular  as  possible ,  and  has  a  small  compact  support.  A 
family  of  2k  periodic  functions  that  satisfy  these  constraints  is  given  by. 


.  ico/2,  .  2n  + 1  /'i  A  A  A\ 

H{0))  =  e  (cos(co/2))  (3.4.14) 


^  . .  i(0/2  .  . 

G(q))  =  4ie  sm(co/2) 


(3.4.15) 


We  can  also  derive 
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sin((0/2)^2"+  1 

(0/2  J 


(3.4.16) 


and 


\j/((0)  = 


sin  ((0/4)^^^  +  2 
(0/4  j 


(3.4.17) 


The  Fourier  transform  of  Q(x)  of  the  primitive  is  therefore. 


§((0)  = 


sin((o/4)^^”  ^ 

00/4  J 


(3.4.18) 


The  overall  filter  bank  structure  using  the  above  wavelet  basis  is  shown  in  Figure  3-9  and 
is  again  based  on  Mallat’s  multiresolution  analysis. 


Figure  3-11  Multiresolution  Biorthogonal  Frequency  Decomposition 
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3.4.3  Fast  Implementation 

A  fast  iterative  method  for  the  above  wavelet  decomposition  is  described  below.  This  fast 
method  is  pivotal  in  ultimately  constructing  a  fast  multiresolution  decomposition  as  well  as  fast 
image  reconstruction  using  our  new  method  of  Chapter  5. 

We  suppose  that  the  wavelet  \|/(;c)  is  characterized  by  the  three  discrete  filters  H,  G,  and 
K  and  L.  We  denote  Hp,  Gp,  and  Kp  and  Lp.  the  discrete  filter  obtained  by  putting  2^-1  zeros 
between  consecutive  coefficients  of  the  filters.  We  also  denote  by  D  the  Dirac  filter  whose 
impulse  response  is  equal  to  1  at  0  and  0  otherwise.  We  denote  by  A*(H,L)  the  separable  convo¬ 
lution  of  the  rows  and  columns,  respectively  of  the  image  A  with  1-D  filters  H  and  L. 


j  =  0 

while  a  <J) 

f 

=  /.(0;.D) 

J  ^ 

(3.4.19a) 

J  ^ 

(3.4.19b) 

=  b  1 

Kj  2} 

(3.4.19c) 

j=j  +  i 

endwhile 

A  fast  iterative  reconstruction  process  is  also  described  below 
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j  =  J 

while  (j  > 


s  it,  f  =  X,.  w’’‘jf*(Ky+l,L;+l)  +  X,.- W V*  (Lj+ l,Kj+ 1).  + 

2-'  '  •'2-'  ■'2-' 


/»(H;+1,H;+1) 

2-' 


J=J-1 

endwhile 


(3.4.20) 


Specific  coefficients  for  the  implementation  of  these  filters  are  shown  in  the  fol¬ 
lowing  table: 


Finite  Implulse  Response  of  the  Filters  H,  G,  K  and  L  That  Correspond  to 
The  Quadratic  Spline  Wavelet 


n 

H 

G 

K 

L 

-3 

0.0078125 

0.0078125 

-2 

0.054685 

0.046875 

-1 

0.125 

0.171875 

0.1171875 

0 

0.375 

-2.0 

-0.171875 

0.65625 

1 

0.375 

2.0 

-  0.054685 

0.1171875 

2 

0.125 

-  0.0078125 

0.046875 

3 

0.0078125 

Table  3-1 
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3.4.4  2-D  representation 

In  contrast  to  the  Haar  QMF  orthogonal  2-D  decomposition  shown  in  section  3.2  Mal- 
lat  uses  a  more  elegant  technique  for  two  dimesional  decomposition  which  lends  itself  to  image 
compression  and  is  the  basis  of  the  Hybrid  Fractal  Wavelet  Method. 

The  algorithm  consists  of  first  preprocessing  an  image  with  a  multiscale  wavelet 

decomposition  as  described  by  Mallat  .  We  chose  the  number  of  scales  to  decompose  based  on 
the  resolution  of  the  image  we  wish  to  reproduce  and  the  amount  of  edge  information  we  wish  to 
save  (Fig.  8.)  Wavelet  encoding  in  2  dimensions  is  genetically  described  in  equations  3.4.21 


N  W  ^  W 

through  3.4.24. 

^^(x,y)  =  ^^(x,y) 

(3.4.21) 

«!-](«)  = 

2/  ^  1  2.x  w 

v/fcy)  =  (;.f) 

(3.4.22) 

s 

s 

Wlf{x,y)  = 

W^f(x,y)  =  /(8)\|/2(x,y 

(3.4.23) 

Wf  =  (Wl/(x,y), 

(3.4.24) 

4^^  and  'F^are  the  gradients  of  the  smoothing  function  0  in  the  x  and  the  y  directions 
respectively.  'Fj*  and  'Fg^are  the  gradients  at  each  scale  s  which  are  usually  power  of  2  in  the  spa¬ 
tial  X  and  y  directions.  Wg'  and  are  the  wavelet  transform  functions  in  the  x  and  y  directions. 


3.4.5  Modulus  Maxima 
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After  computing  the  wavelet  transform  we  compute  modulus  and  gradient  angle 
images  as  described  by  equations  3.3.25  and  3.3.26. 


(3.4.25) 

(3.4.26) 

The  difference  between  Mallat  in  his  two  dimensional  biorthogonal  transform  and 
the  ordinary  orthogonal  Haar  two  dimensional  transform  is  that  he  does  not  subsample  his  images 
as  in  the  1  dimesional  case  and  that  he  applies  only  one  filter  in  the  X  and  Y  directions  to  com¬ 
pute  a  polar  representation  for  modulus  maxima  for  each  of  the  high  pass  bands.  Thus,  Mallat  has 
only  two  high  pass  bands  which  can  be  represented  in  terms  of  x  and  y  or  polar  representation. 


MJ{x,y)  = 

A-J{x,y)  =  argiWjf(x,y)  +  iW'^f(x,y)) 
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Figure  3-12  Modulus  Maxima 

It  is  also  interesting  to  note  as  shown  in  Figure  3-13  that  the  biorthogonal  spline 
has  a  much  more  accurate  frequency  response  than  the  Haar  basis  set.  This  fact  is  extremely 
important  in  the  wavelet  reconstruction  process  since  the  Haar  basis  contains  many  high  fre¬ 
quency  artifacts  in  reproduced  images  due  to  its  sharp  spatial  domain  cutoff.  The  Gaussian  deriv¬ 
ative  spline  has  a  much  smoother  spatial  cutoff  and  thus  much  less  tendency  to  create  artifacts  in 
imagery,  as  is  shown  in  the  frequency  plots  below; 


2Q  40  eo  20  40  60 

Biorthogonal  Spline  Frequency  Profile  Biorthogonal  Spline  Frequency  Profile 

Low  Pass  H  Filters  Scale  1-6  1-D  G  Filters  Scales  1-6  1-D 

1 

0.S 
0 

20  40  60 

Haar  Frequency  Profile 
Low  Pass  H  Filters  Scales  Sctdes  1-6  1-D 

Figure  3-13  Mallat’s  Gaussian  Derivative  vs.  Harr 

3.4.6  Mallat  Technique 

In  a  technique  developed  by  Stephan  Mallat  and  Sifen  Zhong,'^  they  use  the  modulus 
maxima  edge  information  from  the  2  dimensional  biorthogonal  spline  wavelet  described  above  to 
reconstruct  the  original  image.  Thus  modulus  maxima  lines  are  chain  coded  and  only  those  with 
intensity  above  a  certain  threshold  are  store  as  in  the  Zero  Tree  example. 


Alternating  Projections/  Edge  Relaxation 
to  Reconctruct  Image  at  Each  Scale 


/xy> 


Figure  3-14  Alternating  Projections  Reconstruction 
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The  synthesis  process  for  reconstructions  involves  a  complicated  alternating  projection  process 
which  is  similar  to  nonlinear  edge  relaxation.  While  the  edge  information  has  great  use  in  analyz¬ 
ing  shapes  in  imagery  the  reconstruction  process  can  be  somewhat  time  consuming  and  also  has  a 
tendency  to  remove  image  texture  since  it  is  essentially  a  nonlinear  interpolation  process. 


55 


Chapter  4 

Multifractal  Model 
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4.1  Multifractal  1-D  Signal  Decomposition/Reconstruction 

In  order  to  define  our  compression  technique  in  terms  of  the  images  natural  shape 
and  texture  we  turn  to  an  approach  to  signal  decomposition  known  as  the  multifractal  method. 
Recently  the  multifractal  model  has  emerged  as  a  defining  connection  between  wavelets  and  frac¬ 
tals.  The  multifractal  in  most  implementations  uses  Mallat’s  MRA  with  biorthogonal  spline 
wavelet  decomposition  to  analyze  a  signal  and  a  global  iterated  function  system  invariant  measure 
algorithm  to  reconstruct  the  signal.  We  will  first  derive  how  the  multifractal  connects  both  wavelet 
and  fractal  models.  We  will  then  show  how  parameters  from  this  technique  may  be  used  to 
describe  the  natural  fractal  shape  and  texture  of  an  object.  Using  this  connection  we  will  ulti¬ 
mately  develop  the  wavelet-fractal  compression  method. 

To  understand  the  multifractal  concept  we  first  give  an  example  of  an  invariant 
measure  one  dimensional  iterated  function  system.  This  will  serve  as  the  basis  for  synthesis  of  any 
arbitrary  signal.  This  method  is  a  mathematical  example  of  the  method  described  in  section 


2.2.1. 1. 


4.1.1  An  Invariant  Measure  Example 
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The  contractions  Wj,W2 . ,Wj^  and  the  probabilities  Pi,P2 . jPn  determine 

how  frequently  a  certain  pixel  Pjj  will  be  hit  by  the  chaos  game.  The  average  fraction 


^(^1, . ^k^Pij) 

k-^oo  k  ‘j 


(4.1.1) 


if  the  result  of  a  particular  measure  p,  which  has  the  attractor  A„ ,  the  attractor  of  the  IPS  as  its 
support  (i.e.  p(A^)  =  1 ).  In  other  words. 


=  R-j  (4.1.2) 

This  measure  p  is  Borel  measure  and  is  invariant  under  the  Markov  operator  M(v)  which  is 
defined  as  mentioned  in  chapter  1  with  slightly  different  notation.  Let  X  be  a  large  square  in  the 
plane  which  contains  ,  the  attractor  of  the  IPS,  and  v  a  (Borel)  measure  on  X.  Then  this  oper¬ 
ator  is  defined  by 

Af(v)  =  PiVw;’  +P2VW2'  +  •••  +p^vw^*  (4.1.3) 

In  other  words,  M(v)  defines  a  new  normalized  Borel  measure  on  X.  We  evaluate  this  measure 
for  a  given  subset  B  in  the  following  way:  first  we  take  the  preimages  wT*  (B)  with  respect  to  X , 

then  evaluate  v  on  that  and  finally  we  multiply  the  probabilities  pj  and  add  up  the  results.  Here  is 
an  example.  Let 

wi(x)  =  {l/2)x,  =  1/3  (4.1.4) 


W2(x)  =  l/2x+  1/2, 


Pi  =  2/3 


(4.1.5) 
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This  is  an  IPS  which  has  the  unit  interval  as  its  attractor  =  [0,  1  ] .  Now  assume  that  we  start 
with  a  measure  given  by  the  density: 


1 1  if  x  €  [0, 1] 
[  0  otherwise 


(4.1.6) 


ho 


0 


1 


h? 

n 

-rrfHT- 

0  1 


Figure  4-1  Invariant  Measure  Example 

i.e.,  the  initial  measure  is  Vq  =  J hQ{x)dx .  For  a  subset  A  c  [0, 1/2]  of  the  left  half  unit  interval 

A 

wehave  w'2‘(A)c[-1,0]  and  VqCw^^A))  =  0.  ThusVi(A)  =  VoCw^Va)).  A  correspond- 
ing  argument  holds  for  the  right  half  interval  [1,1/2]  while  W2  does  the  same  with  the  right  half 
interval  multiplying  the  result  by  p2  .  Thus  after  the  first  step  we  obtain  the  density 
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h^ix) 


>1  ifxG  [0,1/2] 
^  P2  if  xe  [1/2,0] 
0  otherwise 


(4.1.7) 


and  Vj(A)  =  ^h^{x)dx  .  We  construct  a  measure v 2  =  M{y{^  along  the  same  lines  and 

A 

obtain  the  density  function  h2  as  shown  in  the  above  figure.  In  the  limit  this  process  generates  a 
binomial  measure  that  is  a  self-similar  multifractal  measure  In  the  next  section  we  will  describe  a 
new  selection  of  functional  equations  incorporating  the  wavelet  from  which  we  will  be  able  to 
reveal  the  properties  of  an  invariant  measure  and  synthesize  any  arbitrary  signal. 

4.1.2  Self  Similar  Wavelet  Functional  Equation 

To  determine  the  invariant  measure  implicit  in  a  given  signal  we  have  a  method  of 
interpreting  its  essential  temporal  or  spatial  structure.  This  structure  is  defined  by  the  singularities 

or  edges  of  the  signal.  Thus,  to  detect  singularities  in  the  original  signal  we  can  find  the  points 
within  the  signal  where  most  energy  is  localized  using  the  biorthogonal  spline  approach  with  Mal- 
lat’s  MRA.  The  next  task  is  to  find  a  mapping  or  set  of  iterated  function  system  equations  that 
characterizes  the  original  signal. 

Note  that  the  mapping  proceeds  from  the  lowest  frequency  band  to  the  next  higher 

band  of  frequencies.  *  The  goal  of  the  multifractal  process  is  to  use  the  wavelet  transform  to 
reveal  the  invariant  measure  parameters  of  probability  p ,  translation  r,  and  scale  1.  If  we  have  a 
self  similar  signal /(x)  which  can  be  approximated  the  invariant  measure  parameters  as  follows: 

f(x)  =  plfils,(x-r))  (4.1.8) 

We  can  show  that  by  taking  the  wavelet  transform  of  4.1.8  that  these  these  parameters  naturally 
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fall  out  of  the  wavelet  transform  as  follows: 


Wfis,x)  =  plWf{ls,l{x-r))  (4.1.9) 

Or  more  importantly  taking  the  wavelet  maxima 

Mf(s,x)  =  plMf(ls,l(x-r))  (4.1.10) 


By  using  the  wavelet  maxima  singularity  spectra  shown  in  Fig  4-2c  this  process  can  be  greatly 
simplified  since  we  have  separated  the  original  signal  into  frequency  bands  and  then  reduced  the 
information  within  these  bands  to  an  elementary  set  of  data,  namely  the  maxima  themselves.  We 
can  characterize  the  original  equations  that  resulted  in  figure  4-2c  by  a  histogram  voting  proce¬ 
dure  that  records  the  following  quantities  1,  p  and  r  directly  from  the  maxima  representation.  1  as 
indicated  in  4. 1.10  characterizes  the  contractivity  factor  or  spacing  between  separate  wavelet 
scales 


log/  =  log 


(4.1.11) 


p  is  probability  of  the  invariant  measure  and  characterizes  the  decay  of  energy  across  frequency 
bands. 


log/7  =  log 


Jm^), 


(4.1.12) 


r  is  the  geometrical  displacement  or  translation  between  scales 


61 


(4.1.13) 


4.1.3  Analysis  with  wavelet  ->  Reconstruction  with  fractal 

The  process  of  multifractal  analysis  and  construction  amounts  to  decomposition  with  the 
wavelet  and  invariant  measure  or  ifs  reconstruction.  To  detect  singularities  we  use  a  wavelet 
basis  function  in  the  time  domain  which  follows  the  above  frequency  structure  but  in  the  con¬ 
volution  process  detects  singularities  or  “edges”  in  imagery.  Such  a  basis  set  might  be  the 

derivative  of  a  Gaussian  function  ^  This  process  is  demonstrated  in  the  following  three  fig- 
ures.Figure  4-2a  shows  the  Devils  Staircase  in  1  Dimension  created  with  the  functional  equa¬ 
tions  of  section  4. 1 .2  characterized  by  the  multifractal  parameters:  li=2, 12=4,  pi=0.66, 
P2=0.33,  rj=-.25,  r2=0.37 


Figure  4-2a  Devils  Staircase  Original  Function 
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Figure  4-3  Multifractal  Parameter  Analysis 
Figures  4-3  shows  the  ratios  of  the  spacing  between  maxima  lines  in  figure  4-2c  of 

the  devils  staircase  followed  by  histograms  of  the  number  of  maxima  lines  which  fall  into  each 
category.  As  we  can  see  from  histograms  of  the  modulus  maxima  graphs  we  are  able  to  derive 
the  exact  parameters  of  the  functional  equations  that  created  the  Devils  Staircase  of  Figure  4-2a. 
Thus  we  have  a  means  of  breaking  apart  a  given  one  dimensional  signal  into  its  elementary  frac¬ 
tal  structure  and  then  reconstructing  it  using  iterated  function  system  equations.  Needless  to  say, 
this  method  is  an  extremely  effective  compression  method  since  it  uses  the  innate  mathematical 
structure  of  a  fractal  to  compress  It  also  shows  the  implicit  connection  between  wavelets  and 
fractals.  Unfortunately  this  method  is  much  more  challenging  in  two  dimensions.  We  will  now 
discuss  a  methodology  for  approaching  this  “inverse  fractal”  problem  in  two  dimensions. 
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4.1.4  Two  Dimensional  Global  IFS 


1  1 

In  two  dimensions  the  iterated  function  system  ’  equations  have  both  x  and  y 
components  for  the  reconstruction.  Such  functional  equations  take  a  somewhat  different  form 
since  x  and  y  axis  can  be  coupled.  Simplify  the  process  we  do  not  couple  the  axis  and  we  thus  get 
the  two  following  equations  as  in  the  previous  section  and  also  assuming  that  we  can  approximate 
our  signals  f(x)  and  f(y)  with  a  summation  of  versions  of  itself  we  have: 


/  =  1 


/(}')=  Y^Pirnifimiiy-Si))  (4.1.15) 

I  =  1 


A  set  of  the  above  equations  applied  with  probabilities  p  can  be  used  to  generate  an  arbitrary  two 
dimensional  function  as  was  done  with  the  devils  staircase  in  1  dimension.  Now  setting  lp.5, 
l2=.5,  l3=.5,  mi=0.5,  m2=0.5,  m3=0.5,  rpO.O,  r2=0.5,  r3=0.0,  s,=0.0,  S2=0.0,  S2=0.5,  pi=0.33, 
P2=0.33,  P3=0.33  we  produce  the  Serpinski  triangle  shown  in  Figure  4-4 
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Figure  4-4  Sierpinsky  Triangle 

Unfortunately  the  task  of  finding  the  two  dimensional  functional  analysis  parameters  is 
not  as  easy  as  for  the  1  dimensional  case  since  the  extra  dimension  adds  a  geometrical  order  of 
magnitude  of  complexity.  We  now  compare  the  implementation  of  multifractal  global  IPS  to  local 
IFS. 

4.2  Two  Dimensional  Shape  and  Texture  Analysis 

Now  with  an  understanding  of  the  multifractal  concept  we  turn  to  analysis  of  shape  and 
texture  in  imagery.  First  we  will  examine  how  the  two  dimensional  singularity  spectra  reveals  the 
shape  of  objects  in  imagery.  Then  we  will  apply  this  singularity  spectra  to  texture  and  show  how 
the  concepts  relate  to  the  local  IFS  model  to  ultimately  improve  the  image  compression  process. 
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4.2.1  2-D  Modulus  Maxima  Reveals  Shape 

The  Mallat  basis  set  is  based  on  an  approximation  of  the  first  derivative  of  a  gaus- 

sian.  Such  a  filter  is  shown  to  have  very  accurate  edge  detecting  capabilities.  In  other  words  the 
edges  detected  at  each  scale  are  not  significantly  displaced  from  where  the  edge  is  physically 
located  in  an  image.  Such  a  property  is  very  useful  for  image  reconstruction  as  will  be  shown 
later.  The  biorthogonality  property  of  the  Mallat  Transform  significantly  helps  this  process  since 
it  allows  symmetric  basis  functions  which  also  produce  accurate  edges.  The  Gaussian  is  also 
known  to  produce  edges  which  are  consistent  and  do  not  represent  false  edge  points.  Also  the  first 
derivative  only  produces  one  edge  response  per  edges  unlike  higher  order  derivatives  that  produce 
multiple  responses. 

The  zero  crossing  representation  produced  by  the  first  derivative  of  the  Gaussian  is 
significant  because  it  represents  boundaries  in  2  dimensional  signals  between  different  intensities. 
These  boundaries  are  revealed  by  the  modulus  maxima  edges.  This  modulus  maxima  edge 
method  is  among  the  most  reliable  way  to  detect  edges  since  edges  that  might  otherwise  end  due 
to  low  intensity  can  be  regularized  by  chain  coding  and  finding  continuation  of  edges  based  on 
edge  gradient  values.  Thus  Mallat  is  able  to  exploit  this  fact  for  greater  localization  of  edges 
across  scales.  This  localization  reveals  the  inherent  self-similar  structure  in  an  image  and  our  goal 
in  the  new  wavelet  fractal  method  is  to  exploit  the  modulus  maxima  representation  to  simplify  the 
fractal  compression  process  in  a  similar  manner  to  what  was  achieved  in  the  1 -dimensional  multi¬ 


fractal  case. 
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4-5  Gradient-Based  Chain  Coding 

4.2.2  Lipschitz  Alpha  to  Characterize  Edges 

To  positively  identify  edges,  Mallat  makes  use  of  the  multiple  frequency  scales 

to  see  edge  patterns  over  a  range  wavelet  frequency  bands  .  If  these  edge  patterns  appear  over 
multiple  frequency  bands  then  an  edge  is  positively  identified.  To  characterize  the  self  similarity 
between  scales  Mallat  uses  the  Lipschitz  criteria  as  described  in  equation  4.2.1.  If  we  select  some 
a  where  0  <  a  <  1  and  the  function  f(x,y)  is  uniformly  Lipschitz  over  an  open  set  of  reals  if  there 
exists  a  constant  K  such  that  for  all  points  (x,y)  of  this  open  set 

M  ./(x,y)<K(2-'’)“  (4.2.1) 

2-’ 

Essentially  the  a  criterion  measures  the  intensity  of  the  wavelet  modulus  as  one 
progresses  to  successively  higher  scales  (lower  frequencies).  If  in  a  two  dimensional  signal  an 
object  has  a  small  exponent,  the  intensity  of  the  wavelet  modulus  maxima  stays  relatively  con- 

stant  over  a  number  of  scales  and  we  have  essentially  a  ‘hard  edge’  which  stays  the  same  from 
scale  to  scale  whereas  higher  a  indicate  softer  edges.  Thus  in  2-D  low  a  can  be  used  to  character- 
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ize  hard  geometric  objects  in  scenes  which  is  an  extremely  useful  property  in  removing  noise 
from  objects.  Also  low  a  indicates  occlusions  where  there  are  multiple  objects  in  a  scene  .  Thus 
along  modulus  lines  with  low  a  we  can  separate  two  objects  for  identifying  separate  shapes. 
Because  of  the  denoising  capability  we  can  also  tune  our  algorithm  to  preserve  only  features  of 
interest. 

In  terms  of  our  multifractal  parameters  we  can  define  a  with 


a  =  - 


logp 

log/ 


(4.2.2) 


4.2.3  Fractal  Dimension 

The  Lipschitz  alpha  has  been  related  directly  to  what  is  known  as  fractal 

dimension^®.  Pentland  ^'^and  others  have  shown  that  fractal  dimension  has  shown  promise  in 
characterizing  the  texture  natural  objects  from  man-made  and  others  have  used  fractal  dimension 
to  characterize  the  standard  Broadatz  texture  classes.  Unfortunately  such  texture  characterization 
is  arbitrary  since  it  did  not  have  a  precise  mathematical  method  but  is  instead  discussed  in  terms 
of  box  counting  dimension.  Thus  we  must  find  another  means  of  characterizing  fractal  dimen¬ 
sion. 


As  it  happens,  the  Lipschitz  alpha  is  a  difficult  quantity  to  calculate  experimen¬ 
tally  for  characterizing  texture  since  it  does  not  characterize  individual  points  in  an  image  but  sets 
of  points  instead.  To  characterize  individual  singularity  points  in  an  image  we  turn  instead  to  the 
Holder  exponent.  The  Holder  exponent  h(x)  of  a  distribution  /at  the  point  xq  is  defined  as  the 
greatest  h  so  that/is  Lipschitz  h  at  Xq,  i.e.,  there  exists  a  constant  C  and  polynomial  Pn(x)  of 
order  n  so  that  for  all  x  in  a  neighborhood  of  Xq,  we  have 
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\fix)-P^(x-XQ)\<C\x-XQ\''  (4.2.3) 

4.2.4  Holder  Exponent  Revealed  by  the  Wavelet  Transform 

We  can  use  the  wavelet  transform  to  reveal  the  Holder  exponent  in  a  signal.  Differ¬ 
ent  wavelet  basis  sets  have  different  ranges  of  permissible  Holder  exponents  which  they  can 

reveal.  To  study  this  phenomenon  as  described  by  Ameodo^  we  slightly  redefine  the  wavelet 
transform  where 


(4.2.4) 

where  a  is  a  scale  parameter  a  e  and  b  is  a  real  valued  space  or  time  parameter.  We  now 
define 

1 

a)  =  (4.2.5) 

where  (•l•)£^2^gJ  is  the  scalar  product  in  L^(SR,  dx)  .  We  can  then  derive  that  the  local  singular 

behavior  C\x  -  Xq]  of/around  the  point  x  =  xq  when  the  scale  a  goes  to  0  for  a  given  Holder 

exponent  h(x).  On  the  other  hand ,  if /were  C”  at  Xq  on  could  prove  that  we  would  get  a  power 
law  scaling  exponent  n^>n  . 


In  other  words,  the  maximum  Holder  exponent  that  one  can  achieve  in  a  singularity  spec¬ 
tra  is  equal  to  the  number  of  vanishing  moments  of  the  analyzing  wavelet.  Thus  for  the  first  deriv¬ 
ative  of  the  gaussian  as  well  as  the  Harr  wavelet,  both  of  which  have  one  vanishing  moment,  we 
can  achieve  Holder  exponents  which  are  between  -t-Z-l. 

4.3  2-D  Singularity  Characterization  for  Texture 

Figure  4-6  shows  the  process  of  computing  the  Holder  exponent  on  two  dimensional 
images.  This  exponent  is  first  computed  by  calculating  the  gaussian  derivative  decomposition  of 
the  image  and  then  taking  corresponding  pixels  across  high  pass  images  between  successive 
scales.  For  each  pixel  we  compute  a  graph  of  the  slope  of  the  wavelet  coefficient  magnitudes 
across  scales.  When  the  log  (base  2)  of  the  slope  across  scales  is  computed,  each  resulting  slope 


is  entered  into  a  new  image  at  the  pixel  position  it  represents. 


Mallat’s  Dyadic 
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The  result  is  called  the  singularity  spectra.  An  example  of  the  singularity  spectra  is  computed  for 
the  star  image  shown  in  Figure  4-7. 

The  result  is  that  specific  region  extraction  and  object  detection  can  be  done  with  the  sin¬ 
gularity  spectra  by  specifying  Holder  exponents  in  certain  ranges.  For  instance.  Holder  exponents 
with  negative  values  correspond  to  hard  edges  such  as  the  edges  of  buildings  in  an  image  whereas 
positive  Holder  exponents  correspond  to  naturally  occurring  phenomenon  such  as  vegetatation,  or 
cloud  formation. 

In  the  image  of  Figure  4-7  we  are  able  to  extract  the  shape  of  a  planet  from  a  very  noisy 
saturated  background  simply  by  the  fact  that  the  planet  has  very  hard  fixed  edges  in  the  range 
from  -.21  to  -.2.  If  we  were  to  use  a  different  basis  set  having  two  vanishing  moments  such  as  the 
mexican  hat  function,  we  could  reveal  in  a  broader  range  between  -2  to  2.  However,  this 
basis  set  also  reveals  many  other  unwanted  features  in  the  image  for  a  similar  alpha  range 
so  its  detection  capabilities  are  severely  limited.  The  singularity  spectra  are  an  extremely  use¬ 
ful  tool  in  many  applications  including  synthetic  aperature  radar  analysis,  downlooking  sat¬ 
ellite  surveillance,  and  texture  analysis.  Recently  similar  analysis  has  been  performed  to 


characterize  Broadatz  texture  classes  as  well. 


Original  Planet  Image  in  Noisy  Background 


Planet  Detected  in  Alpha  Range  -.21  to  -.2 


Figure  4-7  Planet  Detected  in  Singularity  Spectra 
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4.3  Analogy  of  Multifractal  Global  IFS  to  Local  IFS 

Now  with  the  concept  shape  and  texture  as  they  apply  to  the  multifractal  we  may  now  bet¬ 
ter  understand  some  of  the  concepts  of  the  local  ifs  which  has  been  used  to  compress  imagery. 
There  are  essentially  three  parameters  needed  for  local  fractal  reconstruction  as  was  discussed  in 
Jaquin  and  Fischers’s  techniques  and  indicated  in  Equation  2.3.1  indicated  below  as  Equation 
4.1.1. 

fix)  =  Tfix)  =  U^^f{x)  +  b  =  Q^fi2x)  +  b  (4.3.1) 

These  parameters  are  spatial  location  and  orientation  symbolized  by  the  spatial  transfor¬ 
mation  Ql  analogous  to  the  displacement  parameters  r  and  s  in  the  multifractal  case  of  equations 
4.1.14  and  4.1.15.  Note,  however  that  the  local  IFS  Ql  represents  a  block  based  operation 
between  two  scales  where  the  r  and  s  parameters  represent  a  point  operation  between  all  scales. 
Secondly  there  is  b  the  intensity  transformation  from  range  to  domain  block,  analogous  to  the 
probabilty  p  in  the  1  dimensional  multifractal  case.  Again  realized  that  b  is  an  intensity  transfor¬ 
mation  between  two  scales  where  p  is  an  intensity  transformation  associated  with  r  and  s  between 
all  scales.  Finally,  the  scale  separation  parameters,  1  and  m  are  a  power  of  2  since  we  are  using 
Mallat’s  dyadic  decomposition. 

Note  that  the  local  IFS  technique  has  more  flexibility  than  the  global  case  since  the  map¬ 
ping  between  frequency  scales  can  be  arbitrary  where  the  global  case  has  a  fixed  relationship 
between  all  scales.  Thus  the  global  case  applies  to  only  certain  naturally  occurring  fractal  objects 
such  as  the  IFS  fern  or  Sierpinsky  Triangle.  Note,  as  previously  stated,  it  is  not  feasible  to  draw  a 
direct  mathematical  relationship  between  the  multifractal  and  local  IFS  parameters  since  the  mul¬ 
tifractal  parameters  represent  pointwise  transformations  and  the  global  ifs  parameters  represent 
block  transformations.  Nonetheless,  the  analogy  will  be  useful  when  studying  shape  and  texture 


relationships  later  in  this  Chapter  5. 


Local  IFS  Mapping 


Global  IFS  Multifractal 


the  multi-fractal  model.  Recalling  that  the  multifractal  maps  the  lower  frequency  information  to 
reproduce  the  higher  frequency  information  in  an  image,  lets  assume  we  have  an  object  with  a 
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given  shape  outline  in  an  image.  This  object  may  or  may  not  be  generated  by  a  fractal  process. 
Whatever  the  case,  the  object  may  also  have  some  form  of  texture  associated  with  it.  If  we  assume 
an  implicit  multifractal  equations  associated  with  the  object,  then  the  gross  edge  outlines  of  the 
object  are  iteratively  mapped  back  onto  the  object  to  reproduce  its  internal  detail  and  texture.  If 
so,  the  wavelet  will  will  have  a  unique  mapping  from  shape  to  texture.  If  the  object  is  not  fractal 
then  the  mapping  will  be  an  approximation.  Thus,  we  are  now  prepared  to  develop  a  fractal  wave¬ 
let  method  which  has  the  basic  elements  of  shape  and  texture  as  its  fundamental  components. 
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The  basic  principle  of  the  fractal,  as  we  have  seen  in  preceding  sections,  is  that  of  an 
object  whose  shape  resembles  that  of  its  smallest  component  and  there  is  a  spatial  mapping  which 
defines  the  relationship  between  the  objects  overall  shape  to  the  smallest  component  shape.  Now 
with  Mallat’s  modulus  maxima  technique  we  have  a  means  of  defining  shape  at  any  given  scale. 
The  multifractal  mapping  represents  the  transition  from  an  objects  shape  to  its  texture.  This  tex¬ 
ture  can  be  defined  in  terms  of  fractal  dimension  which  in  wavelet  theory  is  the  decay  in  energy 
of  wavelet  coefficients  across  scales.  With  this  conceptual  model  we  define  the  wavelet  fractal 
method  of  image  compression. 

The  first  thing  to  remember  when  designing  a  fractal  compression  algorithm  is  that  few 
objects  that  occur  in  imagery  are  generated  by  a  natural  set  of  fractal  equations.  Thus  we  are 
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forced  to  approximate  objects  in  imagery  with  the  fractal  model.  For  this  reason  we  will  choose 
the  local  iterated  function  system  to  reproduce  a  given  part  of  an  image  rather  than  a  global  iter¬ 
ated  function  system  since  the  local  iterated  functions  system  has  more  flexibility  in  its  construc¬ 
tion. 

Looking  at  all  models  of  fractal  encoding  we  realize  that  the  trend  is  to  reconstruct  an 
object  from  its  low  frequency  components  to  its  high  frequency  components.  In  the  wavelet  frac¬ 
tal  case  this  method  approximates  wavelet  coefficients  across  scales.  This  model  fits  with  the  frac¬ 
tal  method  since  low  frequency  corresponds  to  large  scale.  We  also  know  that  the  Mallat 
multiresolution  decomposition  happens  in  diadic  scales  which,  in  the  Haar  basis  set  case,  corre¬ 
sponds  to  blocks  which  are  dimensions  are  powers  of  2  in  size.  Thus  we  shall  keep  with  this 
framework  for  the  Gaussian  derivative  basis  set.  Putting  this  in  the  context  of  the  wavelet  trans¬ 
form  we  re-write  equation  2.3.1  by  inserting  equation  3.4.24  as; 

W^j_^f(x,y)  =  QjW^jfix,y)  +  b  (5.1.1) 

This  equation  is  analogous  to  equation  4.1.9  for  the  global  ifs  multifractal  case.  Thus  we  build  our 
reconstructed  image  from  the  low  frequency  or  large  scale  images  first  and  then  eventually  recon¬ 
struct  the  final  image.  Note  that  his  process  use  the  wavelet  decomposition  to  explicitly  separate 
scales  by  frequency.  Thus  for  a  given  block  size  we  only  have  information  that  is  fits  that  particu¬ 
lar  scale. 

5.1.1  Modulus  Maxima  Decomposition 

The  first  step  in  the  wavelet  fractal  method  is  to  reduce  a  given  image  by  the  Gaussian 
Derivative  modulus  maxima  technique  and  obtain  the  wavelet  scales  W  shown  in  equation  5.1.1. 
We  recall  that  the  gaussian  derivative  modulus  maxima  technique  in  two  dimensions  decomposes 
an  image  into  three  parts.  A  lowpass  image  defined  be  equation  3.4. 19c  as 
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/  (^,  y) 


the  modulus  maxima  highpass  magnitude  M^f{x,y)  and  argument 


images  A  f(x,y)  .  The  results  of  this  decomposition  are  shown  for  three  scales  on  Lena 

below.  It  is  important  to  note  that  in  the  Mallat  decomposition  that  the  scales  are  not  subsampled 
as  in  standard  wavelet  decomposition  so  for  range  to  domain  block  matching  we  subsample  the 
higher  scale  by  2  as  in  standard  fractal  compression  with  no  averaging  process  since  each  scale  is 
naturally  filtered  by  the  wavelet. 

Another  by  product  of  the  Gaussian  derivative  wavelet  decomposition  is  that  it  natu¬ 
rally  organizes  the  features  in  an  image  for  matching  thereby  restricting  the  domain  to  range 
blocks  that  get  chosen  for  a  particular  scale,  and  thus  makes  the  domain  to  range  block  matching 
process  both  accurate  and  fast.  This  is  a  significant  improvement  of  Jaquin  and  Fischer’s  tech¬ 
niques  of  fractal  compression. 
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Original  Image 


Lowpass 


Highpass  Maxima  Highpass  Argument 


5-1  Modulus  Maxima  Decomposition  Scales  1-3 
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5.1.2  Domain  to  Range  Block  Matching 

To  find  a  mapping  Ql  from  shape  to  texture  for  a  particular  object  we  need  an  efficient 

mechanism  for  determining  a  mapping.  If  we  recall  from  section  2.3  range  and  domain  blocks  in 
traditional  fractal  encoding  are  obtained  by  subsampling  the  image  and  then  matching  each  range 
block  to  every  possible  domain  block  in  an  image.  Needless  to  say  this  encoding  process  takes  an 
extremely  long  time  and  is  one  of  the  major  drawbacks  of  traditional  fractal  encoding. 

5.1.2.1  Range  to  Domain  Localization  Based  on  Shape 

If  we  recall  the  concept  of  a  fractal  we  remember  that  it  is  the  same  shape  no  mat¬ 
ter  what  scale  the  user  observes.  Looking  at  the  2-dimensional  Gaussian  derivative  basis  set  we 
see  that  the  modulus  maxima  lines  give  us  a  natural  indication  of  the  object’s  shape  as  was  indi¬ 
cated  in  Chapter  4.  Thus,  if  we  have  a  natural  fractal  object,  we  should  be  able  to  map  the  shape 
of  the  object  into  its  smaller  details  via  our  local  ifs  block  transformation  process.  This  mapping 
process  also  typically  follows  the  natural  cone  of  influence  of  the  wavelet  scale  decomposition  of 

objects.  We  will  see  how  this  process  also  speeds  up  encoding  since  it  reduces  the  domain 
block  pool  and  thus  the  matching  time. 

To  begin  the  object  based  compression  process,  we  must  determine  the  boundaries 
and  interior  of  an  object.  The  modulus  maxima  values  provide  the  natural  boundaries  of  objects 
in  a  given  scene  for  a  particular  scale.  After  choosing  a  scale  at  which  the  object  of  interest  is 
located  we  take  the  modulus  maxima  values  associated  with  that  object  and  chain  code  them 

together  based  on  their  magnitude  and  gradient  direction.^^  Once  we  have  a  set  of  modulus  max¬ 
ima  values  associated  with  a  particular  object  we  determine  if  these  values  form  a  closed  curve.  If 
they  do  we  fill  the  interior  and  this  becomes  an  object  mask.  If  not  we  connect  the  ends  of  the 
curve  with  a  straight  line  and  then  fill  the  interior  of  this  curve  for  the  mask.  The  process  is  shown 


83 


below  in  Figure  5-1. 


Object  Mask  Scenarios 
Closed  Curve 


Open  Curve 


Artificially  Close  Curve 


Line  Segment 
Between  Ends 
of  Open  Curve 


Figure  5-2  Object  Shape  Mask 

If  we  encode  using  the  object  shape  mask  we  select  the  range  and  domain  blocks 
associated  with  that  particular  mask.  This  process  makes  intuitive  sense,  particularly  if  the  object 
is  generated  by  a  self  similar  process  such  as  a  fern  or  a  cloud.  The  modulus  maxima  mapping 
process  in  the  case  of  a  natural  fractal  would  simply  define  how  the  object  iteratively  maps  onto 
itself.  Unfortunately  since  we  are  dealing  with  non-fractal  objects  in  most  case,  and  our  block 
mapping  procedure  is  restricted  to  dyadic  scales  with  linear,  square  block  transformations  we  can 
only  approximate  most  natural  fractals.  Thus  the  block  mapping  process  is  at  best  a  compromise 
between  ease  of  implementation  and  exact  representation. 

An  additional  advantage  of  the  self  encoded  object  is  that  we  can  decode  it  individ¬ 
ually  apart  from  the  overall  image.  This  fact  will  later  be  extremely  useful  for  image  and  video 
editing  purposes.  Also  it  offers  the  new  possibility  of  object  scalablity  to  reduce  the  overall  bit 
rate  of  data  transmission  for  a  compressed  file. 

In  some  cases  when  objects  are  inherently  non  self  similar  restricting  the  domain 
block  pool  can  result  in  reduced  reproduction  quality  of  an  object.  In  the  case  where  the  best  pos¬ 
sible  accuracy  is  required  we  can  still  fall  back  on  selecting  range  and  domain  blocks  from  any 
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part  of  the  image  but  in  most  cases  this  is  not  necessary.  The  object  domain  pool  often  gives 
results  as  good  as  those  for  the  domain  pool  of  the  entire  image.  It  also  turns  out  that  we  can  still 
save  on  bits  in  our  encoded  representation  of  the  image  with  the  b  offset  parameter  since  in  the 
case  of  on  object  with  constant  texture  the  mapping  between  scales  is  the  same  and  within  an 
object  we  select  blocks  which  have  approximately  the  same  b  parameter.  Also  by  having  range 
and  domain  blocks  naturally  organized  by  modulus  maxima  lines  we  have  a  natural  means  orga¬ 
nizing  compressed  information  by  shape  and  as  we  shall  see  texture.  This  fact  is  a  significant 
improvement  over  the  zero-tree  method  since  it  combines  shape  information  as  a  natural  part  of 
the  compression  process.  Examples  of  self  encoded  objects  are  featured  in  Appendix  A. 
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Self  Encoded  Object  Across  Scales 


5-3  Range  to  Domain  Self  Mapping 

Thus,  a  direct  analogy  between  the  multifractal  analysis  and  wavelet  fractal  encod¬ 
ing  method  may  be  drawn.  The  the  object  instead  of  the  entire  image  is  the  attractor.  The  map¬ 
ping  between  scales  to  reproduce  a  given  object  is  simply  designed  to  minimize  distortion  in  the 
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reproduced  object  and  not  necessarily  reveal  the  underlying  physical  process.  However,  knowl¬ 
edge  of  this  technique  improves  encoder  performance  by  speeding  up  the  block  matching  process 
and  reducing  the  bandwidth  of  compressed  information  if  a  user  wants  only  a  specific  object  from 
a  scene  rather  than  all  the  objects  and  background. 

5.1.2.2  Modulus  Maxima  Gradient  Matching 

To  simplify  the  process  of  range  to  domain  block  matching  thus  finding  the  mapping  for 
Ql  in  equation  5.1.1  we  classify  the  range  and  domain  blocks  by  summing  the  blocks  modulus 
maxima  magnitude  and  angle  parameters  since  these  gradient  magnitude  and  angles  are  accurate 
indications  of  energy  and  direction  within  each  block.  Jaquin  used  a  similar  procedure  in  his  clas¬ 
sification  of  blocks  in  traditional  fractal  encoding.  by  applying  the  centroid  operator  to  each 
block.  Now  with  Mallat’s  wavelet  decomposition  energy  direction  is  already  indicated  as  a  natu¬ 
ral  part  of  the  process.  This  block  summation  procedure  is  described  for  both  angle  and  modulus 
values  as  is  shown  in  equations  5.2. 1  and  5.2.2.  Note  that  Norm  is  the  number  of  nonzero  modulus 
or  angle  values  in  a  block 


n  n 

2J  Norm 


n  n 

_  i/=  1  2 

ij 


(5.1.2) 


Norm 


(5.1.3) 
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We  then  apply  this  same  block  matching  procedure  to  the  possible  domain  blocks  at  each  scale. 


For  domain/range  block  matching  processes  we  use  the  difference  of  both  angle  and  modulus  val¬ 
ues  between  successive  scales  to  achieve  domain  to  range  block  matching.  The  block  pair  with 
lowest  combined  mdif  and  adif  are  selected  as  the  best  matched  blocks  rather  than  recursively 
differencing  every  pixel  in  every  possible  range  and  domain  block. 


mdif  = 


(5.1.4) 


adif  = 


(5.1.5) 


Thus  the  operation  of  block  classification  thus  becomes  a  lookup  table  procedure 
rather  than  an  exhaustive  matching  process.  Both  range  and  domain  block  position  in  the  image 
are  stored .  The  block  rotation  value  is  also  determined  by  applying  the  appropriate  flip  that  makes 
the  block  gradient  angles  match  most  closely. 

Until  recently  one  of  the  greatest  drawbacks  to  fractal  and  fractal  wavelet  compres¬ 
sion  has  been  overwhelming  number  of  computations  necessary  to  compute  the  range  to  domain 
block  mappings  since  the  process  was  performed  by  least  mean  square  differencing  for  every 
range  to  domain  block  pair  for  multiple  scales.  Our  matching  technique  dramatically  reduces  the 
number  of  calculations  to  compute  a  compressed  image.  The  number  of  calculations  is 

0(N^log(N))  (a  given  image  has  pixels)  which  is  a  significant  improvement  over  existing  frac¬ 
tal  wavelet  techniques^  which  can  be  as  high  as  0(N^log(N))  in  complexity.  Thus  this  method  is 
a  significant  improvement  over  the  Davis  and  Rinaldo/Cavagano^  methods  of  fractal  wavelet 


compression. 
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Ql  one  of  4  rotations  or  4  flips  Rl  +  decimation  by  2  &  spatial  shift 


Average  Range  Block  Gradient 
Angles  are  Computed  and  Compared 
With  Domain  Blocks  In  Lookup 
Table  for  Flip  Parameters 


Figure  5-4  Gradient  Angle  Matching 


5.1.3  Intensity  Offset  Determination 

In  order  to  determine  the  intensity  offset  or  b  parameter  from  Equation  5.1.1  between  two 
scales  we  find  the  range  and  domain  block  pairs  positions  determined  by  gradient  angle  matching 
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and  then  subtract  the  lowpass  blocks  which  correspond  to  these  positions.  Thus  within  a  given 
range  block  where  the  dimensions  of  the  domain  and  range  block  are  range  {x:xl<x<x2}  and 
{y:yl<y<y2}. 


X  -  x\  y  -  y\  ^  ^ 


(5.1.6) 


This  strategy  makes  sense  since  at  each  stage  of  the  reconstruction  process  we  are  using  a  given 
lowpass  image  to  reconstruct  the  information  in  the  next  highpass  band.  The  spatial  position 
information  via  range  to  domain  block  matching  is  computed  with  the  highpass  modulus  maxima 
bands  and  all  parameters  necessary  to  complete  the  calculation  of  equation  5.1.1  are  present. 
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b  is  an  intensity  offset  between  two  low  frequency  blocks 
Average  Sl2  "  Average  Slj 

b  =  I  I  -  I 

Range  Block  Domain  Block 

Figure  5-5  Dyadic  Maxima  Fractal  Block  Matching 

5.1.4  Zero-Tree  pruning 

Essentially  the  fractal  mapping  process  amounts  to  approximating  successive 
wavelet  frequency  scales  with  information  from  previous  scales.  This  decay  across  scales 
is  known  as  texture  in  the  sense  of  fractal  dimension.  Thus  as  was  stated  before  the  fractal 
mapping  constructs  its  texture  from  its  shape.  Shapiro  noticed  that  the  natural  decay  of  fre- 
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quency  between  scales  can  be  exploited  to  optimize  the  compression  process  by  eliminat¬ 
ing  wavelet  coefficients  that  fall  below  some  threshold. 

This  pruning  of  wavelet  coefficients  can  be  adapted  for  the  fractal  wavelet  com¬ 


pression  method.  We  can  show  that  using  the  modulus  maxima  to  determine  the  range  to  domain 
mapping  can  result  in  a  natural  quadtree  pruning  process.  Since  the  edge  detection  process  natu¬ 
rally  compresses  the  spatial  information  to  the  maxima  locations  and  these  maxima  are  in  turn 
thresholded,  blocks  with  maxima  energy  falling  below  a  certain  threshold  are  naturally  eliminated 

thus  improving  the  overall  compression  process.  Others  such  as  Davis  have  included  a  rate 
distortion  optimization  routine  for  this  process  although  our  technique  is  computationally  faster 
and  has  comparable  compression  performance  with  rate  distortion  optimized  code.  Once  Zero- 
Tree  pruning  is  performed  we  mn-length  encode  range  and  domain  information  at  each  scale. 


Figure  5-6  Dyadic  Maxima  Zero  Tree  Pruning 
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5.1.5  Color 

For  flexibility  in  compressed  domain  color  storage  and  compressed  domain  object  manip¬ 
ulation  we  transform  RGB  images  into  YRB  images  in  the  compressed  domain  corresponding  to 
the  follow  color  transformations.  For  luminance  we  have  the  transformation 

Y  =  0. 177/?  +  0.831  G  + 0.01 15  (5.1.7) 

To  return  from  luminance  to  RGB  we  have  T J 

G  =  0.1355+ 1.23  G-h  0.22/?  (5.1.8) 

Compressed  Domain  Decompressed 

_  Image 


I _ I 


Figure  5-7  Color  Image  Transformation 

Overall  we  decompose  the  image  into  three  separate  color  bands  and  each  one  of  which  has  a 
separate  multiresolution  modulus  maxima  analysis  performed  on  it.  An  overview  of  this  entire 
process  is  featured  in  Appendix  A. 
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5.2  Image  File  Organization 

With  the  great  amount  of  information  obtained  as  the  result  of  the  wavelet  analysis  process 
we  must  find  an  efficient  way  to  characterize  this  information  in  the  compressed  domain.  We  will 
now  develop  a  means  of  organizing  shape,  texture,  and  color  in  the  compressed  file  so  it  is  easily 
accessible  to  the  user  in  its  compressed  form. 

5.2.1  Object  Edge  Blocks  &  Signatures 

Because  the  range  blocks  are  chosen  basis  on  the  modulus  maxima  edge  formulation  and 
their  positions  are  stored  we  can  isolate  edge  blocks  from  interior  blocks  in  the  compressed 
domain.  From  these  edge  blocks  we  can  recover  the  shape  outline  of  each  object  through  a  tech¬ 
nique  known  as  a  signature.  A  signature  is  computed  by  finding  the  geometric  center  of  an  object 
and  then  making  a  plot  of  the  angle  vs.  the  distance  to  the  edge  from  the  geometric  center.  This 
technique  is  extremely  useful  since  it  is  invariant  to  scale  and  can  be  adjusted  for  rotation  in  2 

dimensions.  The  only  requirement  for  this  technique  is  that  it  have  reliable  edges  information. 

Figures  show  the  signature  matching  process.  Signatures  may  be  matched  by  simple  LMS  differ¬ 
encing  algorithms  or  more  sophisticated  correlation  algorithms  for  the  case  of  rotated  objects.  The 
following  figure  shows  the  signature  creation  and  matching  process.  Example  signatures  are 
shown  in  Appendix  A. 
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Signature  Graph  Constructed 


mag  ^  _ ^ 

-ife  4  180 

Angle  0  (deg) 

5-8  Signature  Creation  and  Matching 
5.2.2  Object  Texture  Feature  Vectors 

Not  only  can  the  modulus  maxima  shape  mask  be  used  to  restrict  domain 
block  search  but  it  can  also  be  used  to  segment  relevant  texture  information  about  an 
object.  To  define  the  texture  within  an  object  we  create  the  object  texture  mask  multiply¬ 
ing  a  binarized  copy  of  the  shape  mask  with  the  Holder  exponet  map  of  an  image.  The 
interior  of  this  mask  corresponds  to  the  profile  of  the  texture  within  a  given  object. 
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Object  Shape  Mask  Texture  Mask 


^  -► 
^  Yl3 

Holder 

Mlap 

Hm 

<y 

— 

# 

Xnyn 

5-9  Object  Texture  Mask 

Within  a  texture  mask  we  can  compute  the  average  texture  as  shown  below. 
This  is  known  as  the  texture  or  Holder  vector  for  that  object. 

N  N 


I  I  Hix,y) 

fj  _  ;c  =  Oy  =  0 

" - WTW - 


(5.2.1) 


5.2.3  Object  Color  Feature  Vectors 

Also  the  Red,  Green  and  Blue  partitions  of  a  color  image  may  also  be  segmented 
using  the  object  shape  mask  as  is  shown  below.  Thus  we  can  associate  color  with  given  objects  in 
an  image. 


Red  Mask 


Blue  Mask 

-  ¥ 


Figure  5-10  Color  Mask 
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Color  feature  vectors  V(j  and  the  associated  average  colors  {R,  G,  B)  within  an  objects  mask 
are  described  below; 


N  N 


1  I  R(x,y) 


n  _  JC  =  Ov  =  0 

N-N 


(5.2.4) 


Vc  =  {R,  G,  B)  (5.2.5) 

5.2.4  Object  Compositing  by  Feature  Vectors 

Quite  often  edges  that  define  objects  do  not  form  closed  curves  because  of  varia¬ 
tions  in  image  intensity,  occlusions  or  other  image  artifacts.  Thus  signatures  do  not 
always  reflect  the  true  shape  of  an  object.  To  combat  this  problem  we  can  group  objects 
together.  Objects  with  similar  feature  vectors  can  be  grouped  together  as  one  object  if  their  rela¬ 
tive  positions  are  near  each  other.  This  operation  can  be  user  defined  or  automatic  depending  on 
the  desired  information  in  the  composited  objects.  An  example  of  this  process  is  shown  in 


Appendix  A 
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Such  object  compositing  can  be  useful  for  providing  more  accurate  signature 
information  since  quite  often  there  are  gaps  in  the  chain  coded  block  edges  which  can  be  filled  in 
with  texture  and  color  information  as  is  shown  below. 


Query  Image 
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Figure  5-11  Object  Grouping 


5.2.5  Feature  Vector  Matching  Process 

To  compare  shape,  texture  and  color  of  objects  within  the  compressed  file 
we  need  simply  compare  the  associated  vectors.  This  greatly  increases  the  speed  of  com¬ 
pressed  object  database  matching.  The  matching  process  is  also  quite  straightforward 
since  vectors  can  be  matched  by  simple  LMS  differencing  or  correlation  in  the  case  of 
shape  signatures.  A  graphical  overview  of  the  matching  process  is  shown  in  the  following 
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figure  and  Appendix  A. 
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Figure  5-12  Signature  and  Vector  Matching  Process 

5.2.6  Stored  Image  File  Format 

Combining  the  reorganized  wavelet  fractal  compressed  file  information  we  have  he 
object  based  wavelet  fractal  file.  This  file  contains  all  the  information  necessary  to  interpret 
shape,  texture,  color,  position  and  orientation  of  objects  in  a  scene.  Images  can  be  rendered  pro¬ 
gressively,  objects  can  be  rendered  individually,  objects  can  be  rendered  in  black  and  white  or 
color,  and  every  object  has  shape,  color,  texture,  position,  and  orientation  information  associated 
with  it.  Thus  such  an  image  file  is  ideally  suited  for  content  based  inquiry  applications  as  well  as 
interactive  video  editing  and  in  an  extremely  low  bit  rate  environment.  Formulas  for  object 
moment  calculations  are  shown  in  Appendix  B.  An  example  compressed  image  file  is  shown  in 


Appendix  A 
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Figure  5-13  Image  File  Format 

5.3  Decompression 

Because  our  frequency  mapping  proceeds  from  low  to  high  frequency  our  reconstruction 
process  simply  proceeds  by  iterating  our  compressed  parameters  on  the  low  pass  image.  For  each 
scale  the  iterative  procedure  forms  the  approximation  of  the  next  higher  lowpass  image  and  is 
then  lowpass  filtered  and  the  process  is  repeated.  This  process  thus  removes  all  blocky  artifacts  in 
the  image  while  still  revealing  the  image  features.  The  above  technique  leads  to  a  direct  recon¬ 
struction  method  based  on  the  fast  decomposition  process  described  in  section  3.4.3 
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j=J 

while  (j  >  0) 

j=j-l 

endwhile 

The  stages  of  this  reconstruction  process  are  indicated  graphically  as  follows. 


5-14  Direct  Reconstruction  Technique 

Thus  in  the  compressed  file  our  first  step  is  to  upsample  compressed  lowpass  image  and  then  use 
the  stored  fractal  local  iterated  function  parameters  on  it  to  reconstruct  the  original  image  building 
each  new  scale  from  the  previous  lowpass  image.  The  image  can  be  restored  to  any  desired  resolu¬ 
tion  simply  by  stopping  the  reconstruction  process  at  a  given  scale.  This  iterative  procedure  is 
only  order  0(N)  where  N  is  the  number  of  image  blocks  for  all  scales  and  thus  can  be  performed 
real  time  with  no  special  hardware.  Thus  this  approach  is  a  significant  advance  over  the  Mallat 
alternating  projections  reconstruction  method  because  it  is  extremely  efficient  in  its  reconstruction 


speed.  The  reconstruction  stages  for  6  scales  are  shown  in  the  following  figure. 
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5-15  Multiresolution  Fractal  Wavelet  Reconstruction 
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Thus  our  compression  process  is  progressive  or  spatially  scalable  and  is  well  suited  to  network 
transmission.  For  low  bandwidth  systems  lower  resolution  copies  of  the  image  are  transmitted 
first  so  the  user  can  see  the  image  before  the  entire  file  is  transmitted.  This  fact  will  be  extremely 
useful  in  our  video  compression  technique.  We  can  see  in  the  decompressed  versions  of  Lena  that 
in  fact  most  block  artifacts  are  removed  from  the  image  and  we  have  a  very  high  quality  repro¬ 
duced  version  of  the  original  image.  A  graphical  overview  of  the  decompression  process  is  fea¬ 
tured  in  Appendix  A. 

5.3.1  Results 

Figures  5-16  through  5-17  show  the  test  result  of  transforming  the  images  of  Lena 
with  the  multiresolution  transform  in  comparison  with  other  popular  compression  methods.  In 
figure  5-18  we  see  that  the  multiresolution  technique  has  higher  PSNR  for  a  given  compression 
ratio  than  pure  wavelet.  Our  method  falls  slightly  below  zero-tree  compression  but  in  general  fol¬ 
lows  the  same  compression  numbers  for  Davis’s  spline  wavelet  fractal  approach.  At  low  com¬ 
pression  ratios  our  technique  is  equal  to  or  slightly  below  most  of  the  compression  techniques  in 
PSNR.  At  16: 1  our  wavelet  fractal  approach  surpasses  most  conventional  fractal  and  DCT  meth¬ 
ods  and  maintains  a  relatively  flat  PSNR  curve  where  other  methods  tend  to  decrease  in  reproduc¬ 
tive  quality  This  is  due  to  the  fact  that  by  filtering  at  each  decompressed  scale  we  remove 
artifacts  which  appear  in  linear  transform  techniques  as  well  as  fractal  techniques  when  the  num¬ 


ber  of  coefficients  or  fractal  blocks  decrease. 


Cap]  aNSd 
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Figure  5-18  Comparison  of  Compression  Ratio  vs.  PSNR  for  Several  Compression  Types 
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6.1  Object  Based  Video  Encoding 

As  a  result  of  the  image  compression  method  we  have  developed,  we  can  apply  the  same 
principles  to  object  oriented  video  coding.  Most  video  motion  estimation  techniques  rely  simply 
on  motion  as  a  way  to  define  objects  in  video.  However,  given  our  concept  of  shape-  to-  texture 
mapping  we  have  a  means  of  defining  a  video  frame  by  its  shape,  texture,  and  color  in  one  inte¬ 
grated  approach  and  then  using  optical  flow  to  define  object-based  motion. 

First  we  know  that  for  object-based  coding  to  work  effectively,  and  improve  over 
strictly  block-based  coding,  only  a  few  objects  in  the  scene  are  moving,  object  motion  is  domi¬ 
nant,  and  moderate,  the  moving  objects  cover  40-60%  of  the  image  area,  and  no  camera  motion 
occurs.  For  the  general  crossection  of  video  we  need  to  have  a  technique  which  can  handle  video 
favorable  and  unfavorable  to  object  video. 

6.2  Optical  Flow 

For  video  compression  we  have  an  effective  means  of  detecting  motion  of  objects 

in  the  video  stream.  Sundeswaran  has  shown  that  one  can  compute  optical  flow  from  Mallat’s 
edge  detected  imagery.  If  the  image  intensity  function  is  represented  by  I(x,y,t),  one  can  prove  that 
the  two  components  of  the  optical  flow  (Vx,Vy)  satisfy  the  optical  flow  constraint  equation  6.1.1. 
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At  a  fixed  time  t,  instead  of  solving  the  motion  constrain  equation  for  the  image 

I(x,y,t),  we  can  first  smooth  this  image  with  the  smoothing  function  0(x,y)  dilated  by  2^.  This 
reduced  the  computational  noise  when  estimating  partial  derivatives  with  finite  differences.  We 
thus  have  equation  6.1.2: 

This  equation  allows  us  to  recover  the  normal  component  of  the  flow  from  the 
wavelet  transform  at  the  scale  2j  Instead  of  computing  this  normal  component  at  all  points  (x,y) 
we  compute  it  only  at  the  locations  were  the  wavelet  transform  modulus  is  locally  maximum. 

This  saves  significantly  in  computational  complexity  over  traditional  optical  flow  computation 
techniques. 

Thus  for  each  scale  we  compute  the  optical  flow.  Since  each  dyadic  scale  has  a 
particular  block  size  associated  with  it  we  take  the  average  optical  flow  associated  with  each  NxN 
block  as  we  did  the  average  argument  and  modulus  maxima  as  is  described  by  the  following  equa¬ 
tions. 

N  N 

I 

V  zz  ^  =  Ov  =  0  (6.1.3) 

^  Norm 

N  N 

y  =  x  =  0y  =  0 _  (6.1.4) 

y  Norm 

Optical  flow  associated  with  small  blocks  is  associated  usually  with  background 
texture  motion  such  as  moving  water  or  camera  panning  where  optical  flow  associated  with  larger 
blocks  is  associated  with  object  motion 
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Figure  6-1  Video  Sequence  Coding 


6.2.1  Frame  Structure 

In  our  current  implementation  we  have  only  intra  coded  or  I  frames  and  forward 
predicted  or  P  frames.  Within  this  structure  I  frames  are  self  encoded  as  normal  image  frames  and 
optical  flow  is  computed  between  an  I  frame  and  sucessive  P  frames.  This  model  assumes  there  is 
no  significant  scene  changes  between  two  I  frames. 
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6.2.2  Background 

For  static  background  characterized  by  average  optical  flow  in  blocks  below  a  cer¬ 
tain  threshold  we  do  not  retransmit  these  blocks  between  frames.  Moving  background  this  is  sim¬ 
ply  treated  as  an  added  object.  Individual  block  motion  vectors  are  transmitted  to  update 
background  and  all  background  blocks  are  recoded  if  motion  is  above  a  given  threshold.  Since  this 
process  is  done  in  a  multiresolution  fashion,  smaller  detail  changes  may  be  added  to  update  fine 
texture  while  lower  resolution  information  remains  constant.  Uncovered  background  is  totally 
recoded  as  if  it  were  part  of  an  intra  frame,  as  are  3  dimensional  changes  in  objects  and  new 
objects  that  enter  a  scene  or  scene  changes.  Such  changes  are  characterized  by  optical  flow  above 
a  given  threshold  which  is  set  based  on  desired  video  quality  vs.  bit  rate  requirements. 

6.2.3  Moving  Objects 

As  presented  from  Chapter  4  we  define  shape  at  a  particular  scale  j  by  taking  the  chain 
coded  outline  of  an  object  at  a  particular  scale  and  forming  a  mask.  This  wavelet  defined  shape 
technique  is  simply  a  by-product  of  the  modulus  maxima  decomposition.  We  may  apply  this  same 
technique  to  the  optical  flow  information  firom  the  modulus  maxima  computation.  A  video  object 
mask  is  created  in  the  same  manner  as  other  objects.  A  binary  shape  mask  is  multiplied  by  the 
optical  flow  X  and  y  images  as  is  shown  below.  The  average  flow  within  this  object  thus  character- 
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izes  the  average  shape  based  motion  of  the  object. 


Xflow 


Yflow 


Fig  6-3  Motion  Object  Mask 

This  object  based  format  can  result  in  significant  bit  rate  improvements  over  transmitting 
individual  frames.  The  optical  flow  format,  by  concentrating  motion  into  different  frequency 
scales  does  an  excellent  job  of  characterizing  motion  which  might  occur  within  and  object.  For 
instance  the  external  gears  of  a  moving  train  would  be  characterized  by  one  scale  while  the  overall 
train  engine  itself  would  be  characterized  by  another.  Each  would  have  its  own  motion  but  one 
object  could  be  contained  within  the  other  at  a  different  scale. 

It  also  interesting  to  note  that  such  block  based  optical  flow  analysis  can  give  us  rotation 

and  three  dimensional  information  about  objects^'  In  a  rotating  object  the  edges  will  rotate  faster 
than  the  center  and  thus  depth  may  be  derived  via  motion  vectors.  Also  rotation  may  derived  with 
concentric  vectors  around  a  central  axis.  A  graphical  overview  of  the  video  compression  process 


is  shown  Appendix  A. 


Ill 

6.2.4  Object  Integrity 

In  order  to  determine  whether  an  object  has  changed  significantly  between  frames  so  it 
should  be  updated  we  can  use  the  associate  feature  vector  with  the  object.  If  the  overall  shape  of 
the  object  changes  we  can  compare  0th  moment  or  signature  profile.  If  the  angle  of  the  object  has 
changed  we  can  observe  the  2nd  moment  or  profile  of  the  flow  vectors  at  the  edges.  For  3-D  rota¬ 
tion  we  can  look  at  the  flow  at  the  edges  of  the  object.  If  internal  details  of  the  object  have 
changed  we  can  look  to  average  texture  information  or  flow  vectors  at  smaller  scales.  If  there  are 
significant  changes  the  object  is  totally  re-coded.  Thus  the  object  information  for  compressed 
domain  query  also  improves  the  motion  estimation  process.  Also  compositing  may  be  used  to 
improve  object  quality.  A  pictorial  example  of  this  process  is  shown  in  Appendices  D-2,  D-3, 
and  D-4. 

6.2.5  Video  Object  FUe 

The  video  object  file  has  the  same  capability  as  the  image  file  with  the  addition  of  an  opti¬ 
cal  flow  field  to  track  individual  object  blocks  for  motion  in  P  frames.  In  I  frames  all  range  and 
domain  block  information  is  transmitted.  As  in  the  image  case  three  separate  frames  are  encoded 
after  a  color  transformation  has  been  performed.  One  for  luminance,  one  for  the  red  frame  and 
one  for  the  blue  frame.  An  example  video  file  is  shown  in  Appendix  A. 

6.3  Decoder 

As  with  the  fractal  video  system  the  decoder  is  efficient,  needing  no  special  hard¬ 
ware  for  near  real-time  decoding.  Only  those  blocks  which  change  and  are  not  part  of 
existing  objects  need  to  be  retransmitted;  thus  the  decoder  only  needs  to  update  those 
blocks  that  completely  change.  Because  the  user  can  isolate  self  encoded  objects  these 
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can  be  transmitted  individually  so  that  video  editing  functions  may  be  performed  without 
special  human  intervention. 

6.4  Scalability 

This  intra  frame  process  to  a  new  set  of  scalabilities  in  image  transmission  as  shown  in 
Appendix  A  .  The  first  is  spatial  scalability  -  low  frequency  scales  are  transmitted  first  and  then 
higher  frequency  information  follows  as  new  scales  are  added.  Secondly  we  have  temporal  seal- 
ability  with  P  frames  being  skipped  of  the  video  stream  depending  on  the  speed  of  the  connection 
and  the  amount  of  temporal  resolution  needed.  Also  we  have  object  scalability  where  only  indi¬ 
vidual  objects  are  transmitted  or  just  their  motion  vectors.  Finally,  we  have  color  scalability 
where  black  and  white  luminance  information  or  full  color  can  be  transmitted  depending  on  user 
requirements. 

Appendix  A  show  how  the  above  system  can  be  integrated  into  a  distributed  networked 
content-based  query  system.  Thus  we  may  manipulate  individual  objects  by  shape,  color,  texture, 
and  motion  in  video  sequences.  We  may  database  search  on  individual  frames  and  post  process 
video  to  modify  content.  The  issue  of  how  to  design  a  network  to  optimize  this  search  process  is 
still  an  open  topic  but  the  evolution  of  such  tools  as  JAVA  for  object  oriented  programming  is 
greatly  increasing  the  ease  of  this  process. 

6.5  Results 

The  results  of  our  study  are  shown  as  follows.  We  attempted  to  maintain  the  same 
PSNR  within  the  taxi  sequence  shown  as  the  still  encoded  images  of  Lena.  In  this  compression 
study  motion  was  computed  simply  by  computing  the  flow  between  an  I  and  given  P  frame  where 
I  frames  occurred  every  5  frames.  Future  studies  will  examine  in  detail  object  oriented  motion 
compensated  video.  Factors  that  decrease  the  compression  ratio  achieved  include  camera  jitter. 
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photon  noise  within  pixels,  and  camera  non-uniformity.  Upcoming  studies  will  focus  on  removing 
such  image  distortions  to  obtain  even  higher  compression  and  lower  overall  picture  noise.  The 
prominent  spike  shown  in  these  curves  is  the  result  of  transition  from  an  intra  to  inter  coded  frame 
as  well  as  a  large  truck  entering  the  scene  and  a  car  turning  away  from  the  camera  in  3-D. 


Taxi  Sequence  Frame  1 


Taxi  Sequence  Frame  2 


Frame  1-2  Optical  Flow  X  Component 


Frame  1-2  Optical  Flow  Y  Component 


Reconstructed  Taxi  Frame  1 


Reconstructed  Taxi  Frame  2 


Figure  6-4  Taxi  Sequence 


Figure  6-5  PSNR  of  Video  Compression  at  Various  Compression  Ratios 
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Chapter  7 
Conclusion 

The  fractal-wavelet  method  offers  a  significant  improvement  over  existing  tech¬ 
niques  because  of  its  unified  approach  to  image  analysis  and  compression.  The  fractal  wavelet 
method  itself  gives  naturally  higher  compression  and  better  reproductive  quality  than  conven¬ 
tional  DCT-based  methods.  By  its  wavelet  frequency  division  process,  it  gives  a  more  natural 
organization  to  existing  fractal  methods  and  allows  more  accurate  block  matching.  As  a  result  of 
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its  modulus  maxima  shape  representation  it  gives  a  shape  to  texture  content-based  approach  to 
compressed  file  organization.  By  its  gradient  based  block  matching  technique  it  is  significantly 
faster  than  existing  wavelet-fractal  compression  methods.  Because  of  its  progressive  multiresolu¬ 
tion  frequency  decomposition  method  it  gives  faster  and  more  efficient  means  of  image  decom¬ 
pression  than  conventional  shape-based  wavelet  techniques.  And  by  efficient  organization  of 
content  via  shape,  texture,  color,  and  motion  it  produces  an  efficient  object-based  video  compres¬ 
sion  method. 

The  preceding  wavelet-fractal  method  for  image  compression  offers  a  new  frame¬ 
work  for  data  compression  and  analysis.  This  model  will  hopefully  serve  as  a  guide  both  to  future 
research  as  well  as  provide  a  useful  technique  for  image  and  video  compression  for  immediate 
needs.  It  holds  the  promise  for  even  higher  compression  ratios  with  greater  spatial  resolution  and 
increased  speed.  Also,  because  of  the  well  known  wavlet  fractal  signal  modelling  of  natural  phe- 
nonmena,  it  offers  the  possibility  of  linking  physical  modeling  to  existing  compression  tech¬ 
niques.  Thus,  the  wavelet  fractal  model  should  benefit  researchers  in  a  wide  range  of  disciplines. 
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A.  Image  and  Video  Analysis  Network  System  Overview 

With  the  unique  capabilities  of  the  wavelet  fractal  video  method  the  technique  lends  itself 
well  to  an  application  on  the  existing  World  Wide  Web.  The  high  compression  capability  coupled 
with  the  flexibility  of  object  based  video  and  pattern  matching  are  uniquely  suited  to  a  Web-based 


system.  Figure  A-1  shows  an  over  view  of  how  such  a  system  would  be  organized  on  the  web. 
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Figure  A-1  Network  Wavelet  Fractal  System 
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A.l  Clients 

There  are  essentially  three  client  applications  -  these  applications  are  exported  to  the  user  on  run 
time  of  a  Web  browser  currently  through  the  Java/HTML  programming  languages.  Once  estab¬ 
lished  on  the  client  these  applications  fulfill  three  roles.  The  feature  extraction  software  provides  a 
flexible  interface  for  users  to  read  in  an  image  from  any  server  on  the  web  and  convert  it  into  the 
elemental  feature  vectors  of  shape,  color,  and  texture  as  is  done  in  the  preprocessing  steps  of  the 
video.  In  this  way  ,  various  objects  in  any  image  format  such  as  GIF  and  JPEG  can  be  matched 
against  compressed  images  in  the  wavelet  fractal  database  for  comparison.  Secondly  the  image 
and  video  display  allows  users  to  display  compressed  imagery  and  video  in  complete  or  object 
form  either  from  stored  files  or  live  imagery.  The  decompression  technique  for  images  is  shown 
in  Figure  A-2. 
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Finally  the  pattern  matching  interface  allows  the  user  to  display  individual  objects  in  the  video 
scene  and  match  them  against  shape  color  and  texture  vectors  of  objects  in  the  compressed  video 
database  interactively  -  controlling  the  relative  weighting  of  the  shape,  color  and  texture  matching 
vectors  Objects  may  be  removed  or  added  as  is  shown  Figure  A-  3. 


Original  Objects 


Self  Encoded  Objects 


Figure  A-3  Self  Encoded  Object  Image 
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Once  the  desired  objects  are  selected  shape  information  such  as  the  signatures  of  Figure 
A-4  are  used  to  match  against  the  compressed  database  information  at  the  wavelet  fractal  database 
server  which  will  be  described  later.  Additionally  the  database  object  matching  interface  may  cre¬ 
ate  composite  images  either  automatically  or  with  user  assistance  as  is  shown  in  Figure  A-5 
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A.2  Servers 

Like  the  clients  there  are  three  servers.  Each  server  is  suited  to  its  particular  video  task. 
For  instance  the  image  and  video  compression  server  usually  consists  of  a  video  camera  with  spe¬ 
cial  purpose  hardware  designed  to  transmit  video  to  the  Java  Client.  The  Wavelet  Fractal  database 
sever  on  the  other  hand  is  usually  a  large  multiuser  computer  sometimes  with  many  processing 
nodes  designed  for  feature  vector  query. 

A2.1  Image  and  Video  Compression  Server 

The  image  and  video  compression  server  is  designed  to  transmit  java  software  to  clients 
and  to  compress  imagery  and  video.  In  many  cases  this  operational  can  be  contained  in  one  porta¬ 
ble  PC  so  that  many  sources  of  imagery  and  video  may  be  provide  live  from  multiple  locations  to 
the  client  software. 

A2.1.1  Image  Compression 

The  color  image  compression  operation  is  performed  on  the  server  as  shown  in  Figure  A- 
5.  The  basic  color  compression  compresses  Y,  R,  and  B  color  planes  separately  as  shown  in  a  mul¬ 
tiresolution  framework.  The  resulting  image  file  is  shown  in  Figure  A-7  This  content  indexable 
file  is  easily  referenced  by  the  database  matching  system. 
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A2.1.2  Video  Compression  System 

The  image  and  video  server  also  contains  the  video  compression  system.  This  the  video 
compression  process  is  depicted  in  Figure  A-8.  The  Y,R,  and  B  fields  are  computed  separately  and 
the  optical  flow  between  these  fields  is  used  to  determine  what  compressed  blocks  to  transmit. 
This  function  be  performed  non  real  time  or  real  time  with  a  camera  interface  and  special  purpose 
hardware.  A-9  shows  how  a  typical  video  sequence  is  broken  into  its  component  shape  color  and 
texture  fields  for  content  based  query  and  object  video.  This  process  is  then  illustrated  with  actual 
imagery  in  figures  A- 10  through  A-1 1.  Finally  in  Figure  A.- 12  the  compressed  file  format  is  illus¬ 
trated.  What  is  clear  in  the  image  compression  process  is  that  there  is  a  great  deal  of  flexibility 
with  the  wavelet  fractal  technique  in  terms  of  adapting  to  a  specific  network  bandwidth.  The  spa¬ 
tial,  temporal,  object,  and  color  scalable  features  of  this  compression  system  make  it  practical  for 


use  on  the  low  bandwidth  World  Wide  Web. 
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A3  Wavelet  Fractal  Database  Server 

The  wavelet  fractal  database  server  servers  two  roles.  First  it  is  a  repository  of  data  in  the 
wavlet  fractal  compressed  data  format.  Secondly,  it  performs  searches  across  vast  numbers  of 
compressed  files  very  efficiently  for  object  and  image  matches.  This  role  is  nicely  filled  by  some 
of  the  large  supercomputer  architectures  with  multiple  processors,  large  network  capacity,  and 
large  disk  capacity.  The  process  of  creating  and  submitting  a  query  with  either  the  feature  or  data¬ 
base  matching  interfaces  is  shown  in  Figures  A- 14.  and  A15  .  Many  methods  for  matching  vectors 
may  be  considered  including  neural  networks. 
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A- 14  Compressed  Domain  Object  Matching 
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B  Object  Moments 

Simple  shapes  can  be  characterized  by  zeroth,  first,  and  second  moments.  Zeroth  moment 
is  the  area  in  pixels  occupied  by  an  object.  If  I(x,y)  is  a  luminance  image  with  pixels  of  coordi¬ 
nates  x,y  then  the  zeroth  moment  is. 

N  N 

MO  =  J^Iix,y)  (8.1) 

X  =  Oy  =  0 

First  moment  is  the  intensity  weighted  average  in  the  x  and  y  directions  of  an  object  also 
known  as  the  center  of  mass  or  centroid  of  an  object.  This  information  is  necessary  for  determin¬ 
ing  signature  profile  and  can  also  be  used  to  track  objects  in  motion  video  as  is  discussed  in  the 
next  chapter. 

N-iN-l 

Mix  =  ^  =  =  Q  ^ -  (8-2) 

MO 

E  X /(a:, )’)■)- 

Miy  =  -  (8.3) 

Finally,  second  moment  gives  the  angle  0  of  the  primary  axis  of  orientation  of  the  object.  This 
information  can  be  very  useful  in  the  signature  matching  process  for  lining  up  signatures  of  the 
same  objects  that  has  been  rotated. 

N-lN-l 

M2a  =  X  S  iMlx-xfl{x,y) 

X  =  Oy  =  0 


(8.4) 
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M2b  = 


M2c 


N-lN-l 

z  {Mix- x){Mly -y)I{x,y)  (8.5) 

=  Oy  =  0 


=  z  z  {M\y-yf'I{x,y)  (8.6) 

;c  =  Oy  =  0 


tan29  = 


M2b 

M2a-M2c 


(8.7) 
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